To test the system, the following test WAV audio file has been used. It was obtained from the one that can be found here, licensed under CC0: the only modification was changing the bit depth to 16 bits.
The system, parameters and filters defined in System have been used.
First, a URL is embedded. The following is used:
text_orig = https://example.com/123abcd/index.html
Then, the resulting audio was played using a laptop, captured using a smartphone and saved locally to perform tests. Six different situations have been considered:
- [1]: here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an outdoor environment.
- [2]: here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an outdoor environment.
- [3]: here the capture was made farther from the speakers (about 3 meters away), keeping the device still, in an outdoor environment.
- [4]: here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an indoor environment.
- [5]: here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an indoor environment.
- [6]: here the capture was made farther from the speakers (about 3 meters away), keeping the device still, in an indoor environment.
Both indoor and outdoor environments are considered because indoors the reflected waves have more impact, as can be seen from the tests. Also, in every scenario there is no loud external noise (in the outdoor recordings the wind can be heard, but it is not loud).
All the sounds used on this page can be found in Sources, as part of a demo that can be run from the main Matlab script. Those files contain more bursts than the ones analysed on this page, and some of them couldn't be demodulated using the developed system (this is not always a problem, since another attempt could be made with a later burst). Where possible, the bursts used here are those from which the URL could be recovered.
The recordings appear very attenuated and there is some noise due to the microphone or speakers (with other devices the sound seems clearer). Despite this, the URL could be recovered in 5 cases out of 6. By using other devices, better results could be obtained.
Result of the Testing
In general, the following observations can be made from the results of the tests reported below:
- As expected, the system works better in an outdoor environment, as there are fewer reflected waves and distortions.
- In an indoor environment, the user must stay closer to the speakers, otherwise the URL cannot be recovered. This could perhaps be improved by using better speakers or a better microphone, or by estimating the response of the channel (speakers, environment and microphone) to compensate for the distortions.
- Sometimes the constellations of complex symbols after sampling and differential decoding are a bit noisy. However, except for case [6], the phases associated with bits 0 and 1 are clearly distinguishable, although they vary over time when the ideal carrier is used (before differential decoding). As this is a phase modulation, only the phase matters, so after sampling and differential decoding there are almost no errors.
- Except for case [6], the main cause of failed URL recovery seems to be bad synchronization: the phases associated with bits 0 and 1 were still clearly distinguishable. So, a better synchronization with the preamble and maybe some timing recovery should be implemented.
- The Doppler effect causes the constellations to rotate. However, if the user's speed is limited, this is compensated by differential decoding.
- In most of the cases analysed here, almost no errors are present even right after differential decoding, that is, before Viterbi decoding. Sometimes there are very few errors, and Viterbi decoding is able to correct them all. Reed-Solomon codes bring no advantage in these tests, so they may seem useless. In fact, they were added to the system because, during its development, there were sometimes burst errors to correct. They could be removed from the system, so that shorter messages are transmitted. However, if loud noises or other disturbances cause burst errors, they may be useful: more tests should be done to see whether convolutional codes alone are always enough to correct all errors (maybe after increasing the constraint length).
- The three methods IC, RC and NC always lead to the same final result (the URL is recovered with all of them or with none). However, the system should be tested in other conditions (loud noises, greater movements of the recording device) to see if this is still true.
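The observation about Doppler and differential decoding can be illustrated with a minimal Python sketch. It is not the Matlab code of this project: the bit-to-phase mapping (a 1 flips the phase, a 0 keeps it), the reference symbol and the rotation values are all assumptions chosen for illustration. The sketch rotates the received symbols by a constant offset plus a slow drift, then recovers the bits from the phase difference between consecutive symbols.

```python
import cmath

def diff_encode(bits):
    """Differentially encode bits into BPSK symbols (assumed mapping: 1 flips the phase)."""
    symbols = [1.0 + 0j]  # assumed reference symbol
    for b in bits:
        symbols.append(-symbols[-1] if b else symbols[-1])
    return symbols

def rotate(symbols, phase0, drift):
    """Rotate symbol k by phase0 + drift * k radians (constant offset plus slow drift)."""
    return [s * cmath.exp(1j * (phase0 + drift * k)) for k, s in enumerate(symbols)]

def diff_decode(symbols):
    """Decide each bit from the phase difference between consecutive symbols."""
    bits = []
    for prev, cur in zip(symbols, symbols[1:]):
        delta = cur * prev.conjugate()  # phase of delta = phase difference
        bits.append(1 if delta.real < 0 else 0)
    return bits

bits = [1, 0, 1, 1, 0, 0, 1, 0]
received = rotate(diff_encode(bits), phase0=0.9, drift=0.2)
decoded = diff_decode(received)
print(decoded == bits)
```

As long as the rotation between two consecutive symbols stays well below 90 degrees, the sign of the real part of the difference is unaffected, which is why a limited speed is required.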
From URL to AUDIO
Message Domain: from text to bits
Firstly, text_orig is converted to a bit array. The following table lists the length of the array obtained at each step:
Section | Step | Length after this step (bits) |
---|---|---|
PAYLOAD | Varicode Encoding | 141 |
PAYLOAD | RLE encoding | 122 |
PAYLOAD | Zero-padding | 194 |
PAYLOAD | Data CRC | 210 |
PAYLOAD | Reed-Solomon encoding | 310 |
PAYLOAD | Convolutional Encoding | 632 |
HEADER | Number of Reed-Solomon blocks Encoding | 4 |
HEADER | Header CRC | 8 |
HEADER | Header Repetitions Encoding | 40 |
HEADER+PAYLOAD | Prepending header to payload | 672 |
HEADER+PAYLOAD | Differential encoding | 673 |
HEADER+PAYLOAD | Prepending synchronization preamble | 807 |
HEADER+PAYLOAD | Prepending and appending extra pad bits | 867 |
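The lengths in the table above can be cross-checked with a few subtractions. The following Python sketch only uses the numbers from the table; the variable names are illustrative and the derived quantities are simple differences between consecutive rows, not parameters read from the system.

```python
# Lengths in bits, copied from the table above.
lengths = {
    "zero_padding": 194,
    "data_crc": 210,
    "convolutional": 632,
    "header_crc": 8,
    "header_repetitions": 40,
    "header_plus_payload": 672,
    "differential": 673,
    "preamble": 807,
    "padding": 867,
}

# Bits added by the data CRC.
data_crc_bits = lengths["data_crc"] - lengths["zero_padding"]
# How many times the 8-bit header is repeated.
header_repetitions = lengths["header_repetitions"] // lengths["header_crc"]
# Differential encoding adds a single extra bit.
differential_extra = lengths["differential"] - lengths["header_plus_payload"]
# Bits of the synchronization preamble and of the extra padding.
preamble_bits = lengths["preamble"] - lengths["differential"]
pad_bits = lengths["padding"] - lengths["preamble"]

print(data_crc_bits, header_repetitions, differential_extra, preamble_bits, pad_bits)
```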
In this case, text_orig contains 38 characters plus 1 null-terminating character, so 7-bit ASCII codes would require 273 bits, instead of the 141 obtained after Varicode encoding (or 122 after RLE compression). The following table shows the compression ratios:
Comparison | Compression ratio |
---|---|
ASCII compared to Varicode | 1.94 |
Varicode compared to RLE | 1.16 |
ASCII compared to RLE | 2.24 |
Also, there are 2 Reed-Solomon blocks, so this is the number that is encoded in the header.
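The compression ratios reported above follow directly from the bit counts. A short Python check, where the Varicode and RLE lengths are taken from the table rather than computed by an actual encoder:

```python
text_orig = "https://example.com/123abcd/index.html"

# 7 bits per character, plus one null-terminating character.
ascii_bits = 7 * (len(text_orig) + 1)
varicode_bits = 141  # from the table of lengths
rle_bits = 122       # from the table of lengths

ratios = {
    "ASCII compared to Varicode": round(ascii_bits / varicode_bits, 2),
    "Varicode compared to RLE": round(varicode_bits / rle_bits, 2),
    "ASCII compared to RLE": round(ascii_bits / rle_bits, 2),
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```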
Signal Domain: from bits to PSK signal
Now the bits can be modulated: after bipolar encoding and interpolation, the pulse shaping SRRC filter is applied and the resulting signal is multiplied by the carrier, as shown in the figure below. Then, the final bandpass filter is applied.
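The chain described above (bipolar encoding, interpolation, pulse shaping, carrier multiplication) can be sketched in plain Python. This is only an illustration under several assumptions: the mapping 0 -> -1 and 1 -> +1, the carrier frequency, the sample rate and the samples per symbol are hypothetical values, and a rectangular pulse is used in the example instead of the SRRC filter of the real system. The final bandpass filter is omitted.

```python
import math

def bpsk_modulate(bits, samples_per_symbol, pulse, carrier_hz, sample_rate_hz):
    # Bipolar encoding (assumed mapping: 0 -> -1, 1 -> +1).
    symbols = [1.0 if b else -1.0 for b in bits]
    # Interpolation: one symbol followed by zeros (upsampling).
    upsampled = []
    for s in symbols:
        upsampled.append(s)
        upsampled.extend([0.0] * (samples_per_symbol - 1))
    # Pulse shaping: FIR convolution with the pulse taps.
    shaped = [0.0] * (len(upsampled) + len(pulse) - 1)
    for n, x in enumerate(upsampled):
        if x != 0.0:
            for k, h in enumerate(pulse):
                shaped[n + k] += x * h
    # Multiplication by the carrier.
    return [v * math.cos(2 * math.pi * carrier_hz * n / sample_rate_hz)
            for n, v in enumerate(shaped)]

bits = [1, 0, 1, 1]
sps = 4
rect_pulse = [1.0] * sps  # stand-in for the SRRC taps
waveform = bpsk_modulate(bits, sps, rect_pulse, carrier_hz=18000.0, sample_rate_hz=44100.0)
print(len(waveform))
```

Passing the taps of an SRRC filter as `pulse` would give the pulse shaping described in the text; the rest of the chain stays the same.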

The expected demodulator result, in ideal conditions, is shown in the following figure.

The final bandpass filtering has almost no visible effects on the waveform. However, it is performed anyway, as this is only done once and it lowers unwanted frequencies further, as can be seen from the Fourier transform computed after each step, reported below.
Lastly, normalization is performed. The following table reports some information about the signal before and after this last phase.
Quantity | Before MAX normalization | After MAX normalization |
---|---|---|
RMS | 0.076 | 0.331 |
MAX | 0.161 | 0.700 |
Note that the RMS before normalization is very similar to the expected one that was precomputed in the initialization phase (NORM_RMS_EST = 0.077).
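MAX normalization scales the whole signal by a single gain, so the RMS grows by the same factor as the peak. The following Python sketch shows the idea on a toy signal; the function names and the sample values are illustrative and not taken from the Matlab code.

```python
import math

def rms(signal):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def normalize_peak(signal, target_peak):
    """Scale the signal so that its maximum absolute value equals target_peak."""
    peak = max(abs(v) for v in signal)
    gain = target_peak / peak
    return [v * gain for v in signal], gain

toy = [0.0, 0.08, -0.161, 0.05, 0.12]  # toy samples with peak 0.161
normalized, gain = normalize_peak(toy, target_peak=0.7)
print(round(gain, 3), round(rms(normalized) / rms(toy), 3))

# With the values of the table: gain = 0.700 / 0.161, applied to the RMS of 0.076.
predicted_rms = 0.076 * (0.700 / 0.161)
print(round(predicted_rms, 2))
```

The predicted RMS is about 0.33, in line with the 0.331 of the table given that the inputs are rounded to three decimals.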
The result is m_psk_out and has a duration of 1.705 seconds. If you play it as a sound using proper speakers, you should hear nothing. However, you can detect the presence of the inserted messages by using any audio spectrum analyser (even a simple smartphone app will work): a peak around the carrier frequency will be shown.
Also, in the figure below the entire signal is reported and its main components highlighted.
Embedding the PSK signal into the original audio
The original stereo audio has the following spectrogram:
First of all, some pre-processing is done on this audio track: the bandstop filter is applied to each of its channels and after that normalization is performed. The following table reports some information about the original audio before and after this last phase.
Quantity | Before RMS normalization | After RMS normalization |
---|---|---|
RMS of audio (LEFT CH) | 0.201 | 0.050 |
RMS of audio (RIGHT CH) | 0.196 | 0.049 |
RMS of audio (mean) | 0.198 | 0.050 |
MAX of audio (LEFT CH) | 0.965 | 0.243 |
MAX of audio (RIGHT CH) | 0.983 | 0.248 |
MAX of audio (TOT) | 0.983 | 0.248 |
As the duration of this is audio_t=56.54 seconds, the PSK message will be inserted into it every DELTA_T=10 sec for floor(audio_t/DELTA_T)=11 times (6 in the left channel and 5 in the right one, interleaved), as shown in the following figure. Every BPSK signal will not be reduced in amplitude, since constraints are fulfilled.
The information about the output audio with the embedded URL is shown in the following table:
Quantity | Value |
---|---|
RMS (mean) | 0.143 |
MAX (tot) | 0.869 |
The output stereo audio with the URL embedded has the spectrogram shown below. Here you can see the result of the bandstop filtering (blue/violet parts around FC Hz) and of the insertion of the PSK messages (yellow parts around FC Hz).
Lastly, if you play the resulting audio signal, you should hear no difference from the original audio above. Again, proper speakers must be used and the volume must not be at the maximum, otherwise distortion may be audible. Also, you can see the presence of the inserted messages by using any audio spectrum analyser.
From AUDIO back to URL
The system has been tested in the six situations mentioned above. Every time, REC_T seconds have been taken from a random position of the recorded audio files, to simulate a mic recording.
Scenario [1]
Here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an outdoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.0027 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. The cross correlation is very similar to the expected one.
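Preamble synchronization by cross-correlation can be sketched as follows in Python. The preamble used here is a hypothetical 7-chip sequence and the received samples are made up; the real preamble and the exact detection rule of the system are not shown on this page. The absolute value of the correlation is used so that a sign inversion of the received signal does not prevent detection.

```python
def cross_correlate(signal, template):
    """Sliding dot product of the template over the signal (valid positions only)."""
    positions = len(signal) - len(template) + 1
    return [sum(signal[i + k] * template[k] for k in range(len(template)))
            for i in range(positions)]

def find_preamble(signal, template):
    """Return the offset where the correlation magnitude is largest, and the correlation."""
    corr = cross_correlate(signal, template)
    best = max(range(len(corr)), key=lambda i: abs(corr[i]))
    return best, corr

preamble = [1, 1, 1, -1, -1, 1, -1]  # hypothetical preamble
# Some noise, then the attenuated preamble starting at sample 3, then more noise.
received = [0.1, -0.2, 0.05] + [0.5 * p for p in preamble] + [0.1, 0.0]
offset, corr = find_preamble(received, preamble)
print(offset)
```

In an indoor recording, reflections would show up as additional, smaller peaks in `corr` around the main one.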
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the (IC) after sampling plot, the effect of a carrier phase mismatch can be seen. There is some noise, but the symbols are always clearly distinguishable.

The following figure shows the in-phase and quadrature components of these symbols over time. They are always distinguishable (in the (IC) after sampling plot there is somewhat more noise).

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are always distinguishable. In the (IC) after sampling plot, the effect of a small constant carrier frequency mismatch can also be seen (possibly due to slight movement of the smartphone during the recording).

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
As no errors are detected using CRC check, the RLE and Varicode decoding are performed: the text is recovered and is equal to the original:
text_rec = https://example.com/123abcd/index.html
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
HEADER | After repetitions decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After convolutional decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After Reed-Solomon decoding | 0 (0%) | 0 (0%) | 0 (0%) |
Scenario [2]
Here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an outdoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.0019 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. Here, the cross correlation is similar to the expected one.
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the (IC) after sampling plot, the effect of Doppler due to movement can be seen, but there isn't much noise. In the other cases there isn't much noise either, and the symbols are always clearly distinguishable.

The following figure shows the in-phase and quadrature components of these symbols over time. They are always distinguishable. In the (IC) after sampling plot, the consequence of the Doppler effect can also be seen.

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are always distinguishable. In the (IC) after sampling plot, the effect of a time-varying carrier frequency mismatch due to Doppler can also be seen.

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
As no errors are detected using CRC check, the RLE and Varicode decoding are performed: the text is recovered and is equal to the original:
text_rec = https://example.com/123abcd/index.html
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
HEADER | After repetitions decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After convolutional decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After Reed-Solomon decoding | 0 (0%) | 0 (0%) | 0 (0%) |
Scenario [3]
Here the capture was made farther from the speakers (about 3 meters away), keeping the device still, in an outdoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.0018 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is very similar to the expected one.
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the (IC) after sampling plot, the effect of a carrier frequency mismatch can be seen. In the other cases there is some noise, but the symbols are always distinguishable.

The following figure shows the in-phase and quadrature components of these symbols over time. Although noisy, they are mostly distinguishable. In the (IC) after sampling plot, the effect of a constant carrier frequency mismatch can also be seen (possibly due to slight movement of the smartphone during the recording).

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are always distinguishable. In the (IC) after sampling plot, the effect of a constant carrier frequency mismatch can also be seen (possibly due to slight movement of the smartphone during the recording).

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
As no errors are detected using CRC check, the RLE and Varicode decoding are performed: the text is recovered and is equal to the original:
text_rec = https://example.com/123abcd/index.html
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
HEADER | After repetitions decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After differential decoding | 1 (0.16%) | 1 (0.16%) | 0 (0%) |
PAYLOAD | After convolutional decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After Reed-Solomon decoding | 0 (0%) | 0 (0%) | 0 (0%) |
Scenario [4]
Here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an indoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.0023 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is similar to the expected one, but there are some other peaks, due to reflection of waves.
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the (IC) after sampling plot, the effect of a carrier frequency mismatch can be seen. In the other cases there is some noise, but the symbols are mostly distinguishable.

The following figure shows the in-phase and quadrature components of these symbols over time. Although noisy, they are still somewhat distinguishable. In the (IC) after sampling plot, the effect of a constant carrier frequency mismatch can also be seen (possibly due to slight movement of the smartphone during the recording).

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are always distinguishable. In the (IC) after sampling plot, the effect of a constant carrier frequency mismatch can also be seen (possibly due to slight movement of the smartphone during the recording).

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
As no errors are detected using CRC check, the RLE and Varicode decoding are performed: the text is recovered and is equal to the original:
text_rec = https://example.com/123abcd/index.html
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 2 (5%) | 2 (5%) | 2 (5%) |
HEADER | After repetitions decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After differential decoding | 2 (0.32%) | 2 (0.32%) | 2 (0.32%) |
PAYLOAD | After convolutional decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After Reed-Solomon decoding | 0 (0%) | 0 (0%) | 0 (0%) |
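In the table above, the 2 errors in the 40 header bits disappear after repetitions decoding. A plausible way to obtain this is a majority vote over the 5 copies of each of the 8 header bits, sketched below in Python. The layout of the copies (the whole 8-bit header sent 5 times in a row), the header content and the error positions are assumptions made for illustration.

```python
def repetition_decode(received, block_len, repeats):
    """Majority vote over the copies of each bit (assumes the block is repeated as a whole)."""
    decoded = []
    for i in range(block_len):
        votes = sum(received[r * block_len + i] for r in range(repeats))
        decoded.append(1 if 2 * votes > repeats else 0)
    return decoded

header = [0, 0, 1, 0, 1, 0, 1, 1]  # hypothetical 8-bit header (4 bits + 4-bit CRC)
sent = header * 5                  # 40 bits on the channel
received = list(sent)
received[3] ^= 1                   # two channel errors, as in this scenario
received[17] ^= 1
decoded = repetition_decode(received, block_len=8, repeats=5)
print(decoded == header)
```

With 5 copies, up to 2 wrong copies of the same bit are tolerated, so any 2 errors among the 40 bits are corrected.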
Scenario [5]
Here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an indoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.004 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is similar to the expected one, but there are some other peaks, due to reflection of waves.
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the (IC) after sampling plot, the effects of Doppler due to movement and of noise can be seen. In the other cases there is some noise too, but the symbols are mostly distinguishable.

The following figure shows the in-phase and quadrature components of these symbols over time. Although noisy, they are still somewhat distinguishable. In the (IC) after sampling plot, the consequence of the Doppler effect can also be seen.

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are always distinguishable. In the (IC) after sampling plot, the effect of a time-varying carrier frequency mismatch due to Doppler can also be seen.

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
As no errors are detected using CRC check, the RLE and Varicode decoding are performed: the text is recovered and is equal to the original:
text_rec = https://example.com/123abcd/index.html
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 0 (0%) | 0 (0%) | 0 (0%) |
HEADER | After repetitions decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After differential decoding | 6 (0.95%) | 6 (0.95%) | 5 (0.80%) |
PAYLOAD | After convolutional decoding | 0 (0%) | 0 (0%) | 0 (0%) |
PAYLOAD | After Reed-Solomon decoding | 0 (0%) | 0 (0%) | 0 (0%) |
Scenario [6]
Here the capture was made farther from the speakers (about 3 meters away), keeping the device still, in an indoor environment.
The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.
Then, the BPSK signal is detected as shown in the following figure.
After that, the RMS of the detected signal is 0.019 and is normalized to NORM_RMS_EST = 0.077.
Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is similar to the expected one, but there are some other peaks, due to reflection of waves.
At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. Here there is a lot of noise.

The following figure shows the in-phase and quadrature components of these symbols over time. They are very noisy and never distinguishable.

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the phases are almost never distinguishable (except when there is a run of the same symbol).

After that, data are hard decoded into bits. Then the header is decoded to get the length of the payload and this one is decoded from convolutional and Reed-Solomon codes.
In this case CRC check fails and the URL can't be recovered.
Comparison with expected data
Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:
Section | Stage | IC | RC | NC |
---|---|---|---|---|
HEADER | After differential decoding | 20 (50%) | 22 (55%) | 21 (52.5%) |
HEADER | After repetitions decoding | 5 (62.5%) | 5 (62.5%) | 4 (50%) |
PAYLOAD | After differential decoding | 218 (34.49%) | 210 (33.22%) | 185 (29.27%) |
PAYLOAD | After convolutional decoding | 125 (40.32%) | 118 (38.07%) | 107 (34.52%) |
PAYLOAD | After Reed-Solomon decoding | 70 (33.33%) | 71 (33.81%) | 59 (28.10%) |
Also, the following figures show the distribution of errors over time for each method and stage (0 means that the bit is correct, +1 that the bit is 0 instead of 1, -1 that the bit is 1 instead of 0).
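The error counts of the tables and the error distribution described above can both be derived by comparing the received bits with the expected ones. A minimal Python sketch, with illustrative function names and made-up bit sequences (only the count of 218 errors out of 632 bits is taken from the table):

```python
def error_trace(expected, received):
    """0: correct bit, +1: received 0 instead of 1, -1: received 1 instead of 0."""
    return [e - r for e, r in zip(expected, received)]

def error_summary(expected, received):
    """Number of wrong bits and the corresponding percentage."""
    trace = error_trace(expected, received)
    errors = sum(1 for t in trace if t != 0)
    return errors, 100.0 * errors / len(trace)

print(error_trace([1, 0, 1, 0], [0, 0, 1, 1]))

# Made-up sequences reproducing one count of the table: 218 wrong bits out of 632.
expected = [1] * 632
received = [0] * 218 + [1] * 414
errors, percent = error_summary(expected, received)
print(errors, round(percent, 2))
```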