audioURL - Testing

To test the system, the following test WAV audio file was used. It is derived from the one that can be found here, licensed under CC0; the only modification was changing the bit depth to 16 bits.

Original sample audio signal in which the URL has to be embedded.

The system, parameters and filters defined in System have been used.

First, a URL is embedded. The following is used:

text_orig = https://example.com/123abcd/index.html

Then, the resulting audio was played on a laptop, captured with a smartphone and saved locally for testing. Six different situations were considered:

[1]: capture made close to the speakers (about 1 meter away), keeping the device still, in an outdoor environment.
[2]: capture made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an outdoor environment.
[3]: capture made farther from the speakers (about 3 meters away), keeping the device still, in an outdoor environment.
[4]: capture made close to the speakers (about 1 meter away), keeping the device still, in an indoor environment.
[5]: capture made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an indoor environment.
[6]: capture made farther from the speakers (about 3 meters away), keeping the device still, in an indoor environment.

Both indoor and outdoor environments are considered because reflected waves have more impact indoors, as can be seen from the tests. Also, no scenario includes loud external noise (the outdoor recordings contain some wind noise, but it is not loud).

All the sounds used in this page can be found in Sources, as part of a demo that can be run from the main Matlab script. Those files contain more bursts than the ones analysed in this page, and some of them could not be demodulated by the developed system (this is not always a problem, since another attempt can be made with a later burst). Where possible, the bursts used here are those from which the URL could be recovered.

The recordings appear very attenuated and contain some noise due to the microphone or speakers (with other devices the sound seems clearer). Despite this, the URL could be recovered in 5 cases out of 6, and better results could probably be obtained with other devices.

Result of the Testing

In general, the following observations can be made from the results of the tests reported below:

As expected, the system works better in an outdoor environment, as there are fewer reflected waves and distortions.
In an indoor environment, the user must stay closer to the speakers, otherwise the URL may not be recovered. This might be improved by using better speakers or a better microphone, or by estimating the response of the channel (speakers, environment and microphone) to compensate for the distortions.
Sometimes the constellations of complex symbols after sampling and differential decoding are a bit noisy. However, except for case [6], the phases associated with bits 0 and 1 are easy to tell apart, although they vary over time when the ideal carrier is used (before differential decoding). As this is a phase modulation, only the phase matters, so after sampling and differential decoding there are almost no errors.
Except for case [6], the main cause of failed URL recovery seems to be bad synchronization: the phases associated with bits 0 and 1 were still clearly distinguishable. So, better preamble synchronization and perhaps some timing recovery should be implemented.
The Doppler effect causes the constellations to rotate. However, if the user's speed is limited, this is compensated by differential decoding.
In most of the cases analysed here, almost no errors remain after differential decoding, that is, before Viterbi decoding. Sometimes there are very few errors, and Viterbi decoding is able to correct them all. Reed-Solomon codes give no advantage in these tests, so they may seem useless. In fact, they were added to the system because burst errors sometimes had to be corrected during development. They could be removed from the system, so that shorter messages are transmitted. However, they may be useful if loud noises or other disturbances cause burst errors: more tests should be done to see whether convolutional codes (maybe with a larger constraint length) are always enough to correct all errors.
The three methods IC, RC and NC always lead to the same final outcome (the intermediate error counts differ slightly in cases [5] and [6]). However, the system should be tested in other conditions (loud noises, greater movements of the recording device) to see if this is still true.
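
The role of differential decoding in these observations can be illustrated with a short Python sketch (the project itself is written in Matlab, and this is not its code). The bit-to-symbol mapping, the reference bit and the values of phase0 and drift are assumptions chosen only for illustration. The sketch applies a constant phase offset plus a slow rotation, like the one a small carrier frequency mismatch would cause, and compares a direct hard decision with differential decoding.

```python
import cmath

def diff_encode(bits):
    """Differentially encode bits: the output has one extra reference bit."""
    out = [0]  # reference bit (assumed to be 0 in this sketch)
    for b in bits:
        out.append(out[-1] ^ b)
    return out

def bpsk_symbols(bits):
    """Map bit 0 to +1 and bit 1 to -1 (assumed mapping)."""
    return [complex(1 - 2 * b, 0) for b in bits]

def diff_decode(symbols):
    """Compare each symbol with the previous one: a phase flip means bit 1."""
    return [1 if (s * p.conjugate()).real < 0 else 0
            for p, s in zip(symbols, symbols[1:])]

bits = [1, 0, 1, 1, 0, 0, 1, 0]
encoded = diff_encode(bits)
tx = bpsk_symbols(encoded)

# Hypothetical channel: constant phase offset plus a slow rotation per symbol
phase0, drift = 1.0, 0.2  # radians, radians per symbol
rx = [s * cmath.exp(1j * (phase0 + drift * k)) for k, s in enumerate(tx)]

# A direct hard decision on the rotated symbols goes wrong after a few symbols
coherent = [1 if s.real < 0 else 0 for s in rx]

# Differential decoding only looks at the phase change between neighbours
decoded = diff_decode(rx)

print(coherent == encoded)  # False
print(decoded == bits)      # True
```

The rotation between two consecutive symbols is only drift radians, so the sign of the product with the previous symbol is preserved as long as the drift per symbol stays well below 90 degrees. Note also that the encoder adds one bit, which matches the step from 672 to 673 bits in the length table further below.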

From URL to AUDIO

Message Domain: from text to bits

First, text_orig is converted to a bit array. The following table lists the length of the array obtained at each step:

	Step	Length after this step
PAYLOAD	Varicode Encoding	141
	RLE encoding	122
	Zero-padding	194
	Data CRC	210
	Reed-Solomon encoding	310
	Convolutional Encoding	632
HEADER	Number of Reed-Solomon blocks Encoding	4
	Header CRC	8
	Header Repetitions Encoding	40
HEADER+PAYLOAD	Prepending header to payload	672
	Differential encoding	673
	Prepending synchronization preamble	807
	Prepending and appending extra pad bits	867

Here text_orig has 38 characters plus 1 null terminator, so 7-bit ASCII codes would need 273 bits, instead of the 141 obtained after Varicode encoding (or 122 after RLE compression). The following table shows the compression ratios:

	Compression ratio
ASCII compared to Varicode	1.94
Varicode compared to RLE	1.16
ASCII compared to RLE	2.24
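
The ratios in this table follow directly from the bit counts. A minimal Python check (not part of the project's Matlab code; the Varicode and RLE lengths are simply copied from the length table, not recomputed):

```python
text_orig = "https://example.com/123abcd/index.html"

# 38 characters plus 1 null terminator, 7 bits each
ascii_bits = (len(text_orig) + 1) * 7
varicode_bits = 141  # copied from the length table
rle_bits = 122       # copied from the length table

ratios = {
    "ASCII compared to Varicode": ascii_bits / varicode_bits,
    "Varicode compared to RLE": varicode_bits / rle_bits,
    "ASCII compared to RLE": ascii_bits / rle_bits,
}
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```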

Also, there are 2 Reed-Solomon blocks, so this is the value that is encoded in the header.

Signal Domain: from bits to PSK signal

Now the bits can be modulated: after bipolar encoding and interpolation, the pulse shaping SRRC filter is applied and the resulting signal is multiplied by the carrier, as shown in the figure below. Then, the final bandpass filter is applied.

Modulation of the bits to transmit (pulse shaping and carrier multiplication). Only a limited time interval is shown.
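
The chain described above (bipolar encoding, interpolation, SRRC pulse shaping and carrier multiplication) can be sketched in plain Python. This is only an illustration, not the project's Matlab code: the sampling rate, carrier frequency, samples per symbol, roll-off, filter span and bit mapping are all hypothetical values, and the final bandpass filter is left out.

```python
import math

def srrc(beta, sps, span):
    """Square-root raised cosine taps: span symbols per side, sps samples per symbol."""
    taps = []
    for n in range(-span * sps, span * sps + 1):
        t = n / sps  # time in symbol periods
        if n == 0:
            h = 1 - beta + 4 * beta / math.pi
        elif abs(abs(4 * beta * t) - 1) < 1e-9:
            # Limit value where the general formula would divide by zero
            h = (beta / math.sqrt(2)) * (
                (1 + 2 / math.pi) * math.sin(math.pi / (4 * beta))
                + (1 - 2 / math.pi) * math.cos(math.pi / (4 * beta)))
        else:
            num = (math.sin(math.pi * t * (1 - beta))
                   + 4 * beta * t * math.cos(math.pi * t * (1 + beta)))
            den = math.pi * t * (1 - (4 * beta * t) ** 2)
            h = num / den
        taps.append(h)
    return taps

def modulate(bits, fs, fc, sps, beta=0.5, span=4):
    """Bits -> bipolar -> zero-stuffed -> SRRC shaped -> multiplied by the carrier."""
    bipolar = [1.0 - 2.0 * b for b in bits]  # 0 -> +1, 1 -> -1 (assumed mapping)
    upsampled = []
    for a in bipolar:
        upsampled.append(a)
        upsampled.extend([0.0] * (sps - 1))
    taps = srrc(beta, sps, span)
    # Direct-form convolution (slow but dependency-free)
    shaped = [
        sum(taps[k] * upsampled[n - k]
            for k in range(len(taps)) if 0 <= n - k < len(upsampled))
        for n in range(len(upsampled) + len(taps) - 1)
    ]
    return [x * math.cos(2 * math.pi * fc * n / fs) for n, x in enumerate(shaped)]

# Hypothetical parameters, for illustration only
signal = modulate([0, 1, 1, 0], fs=48000, fc=18000, sps=8)
print(len(signal))
```

Applying the same SRRC filter again at the receiver (the matched filter) gives an overall raised cosine response, which is why the sampled values in the expected demodulation are free of inter-symbol interference in ideal conditions.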

The expected demodulator result, in ideal conditions, is shown in the following figure.

Expected demodulation of the bits to transmit (carrier multiplication, matched filter and sampling). Only a limited time interval is shown.

The final bandpass filtering has almost no visible effects on the waveform. However, it is performed anyway, as this is only done once and it lowers unwanted frequencies further, as can be seen from the Fourier transform computed after each step, reported below.

Fourier transforms after some stages of the modulation process

Lastly, normalization is performed. The following table reports some information about the signal before and after this last phase.

	Before MAX normalization	After MAX normalization
RMS	0.076	0.331
MAX	0.161	0.700

Note that the RMS before normalization is very similar to the expected one that was precomputed in the initialization phase (NORM_RMS_EST = 0.077).
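
MAX normalization is a single gain applied to the whole signal, so the RMS scales by the same factor as the peak. The Python sketch below (not the project's Matlab code; the function names are made up for this example) shows the operation and checks it against the rounded values in the table.

```python
import math

def rms(x):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def normalize_max(x, target_max):
    """Scale the signal so that its peak absolute value equals target_max."""
    gain = target_max / max(abs(v) for v in x)
    return [v * gain for v in x], gain

# Toy signal: both the peak and the RMS are multiplied by the same gain
x = [0.1, -0.2, 0.05]
y, g = normalize_max(x, 0.7)

# Same reasoning applied to the rounded table values
gain = 0.700 / 0.161
rms_after = 0.076 * gain
print(f"gain = {gain:.2f}, RMS after = {rms_after:.2f}")
```

With the rounded inputs the predicted RMS is about 0.33, in line with the 0.331 reported in the table.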

The result is m_psk_out and has a duration of 1.705 seconds. If you play it as a sound using proper speakers, you should hear nothing. However, you can detect the presence of the inserted messages by using any audio spectrum analyser (even a simple smartphone app will work): a peak around the carrier frequency will be shown.

Normalized bandpass filtered PSK audio signal (m_psk_out) obtained by coding text_orig. WARNING: although this signal should be inaudible, it has a high amplitude. Digital clipping does not occur, but do not use high volumes: the inserted messages could become audible and the speakers could be damaged by the saturation of their amplifiers.

The figure below shows the entire signal with its main components highlighted.

Resultant BPSK signal that encodes the URL. Its main components are highlighted

Embedding the PSK signal into the original audio

The original stereo audio has the following spectrogram:

Spectrogram of the original audio in which the URL has to be embedded.

First of all, this audio track is pre-processed: the bandstop filter is applied to each of its channels and then normalization is performed. The following table reports some information about the original audio before and after this last phase.

	Before RMS normalization	After RMS normalization
RMS of audio (LEFT CH)	0.201	0.050
RMS of audio (RIGHT CH)	0.196	0.049
RMS of audio (mean)	0.198	0.050
MAX of audio (LEFT CH)	0.965	0.243
MAX of audio (RIGHT CH)	0.983	0.248
MAX of audio (TOT)	0.983	0.248

As its duration is audio_t=56.54 seconds, the PSK message is inserted into it every DELTA_T=10 sec, 11 times in total (6 in the left channel and 5 in the right one, interleaved), as shown in the following figure. No BPSK signal needs to be reduced in amplitude, since the constraints are fulfilled.

Insertion of BPSK messages into original audio to produce the output
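
One schedule that reproduces the counts above is sketched below in Python. It rests on an assumption that this page does not state explicitly: each channel carries one burst every DELTA_T seconds, the right channel is offset by half that period, and a burst is inserted only if it fits entirely in the audio. The actual Matlab script may place the bursts differently.

```python
def burst_schedule(audio_t, delta_t, burst_t):
    """Return the burst start times (in seconds) for each channel.

    Assumption: one burst every delta_t seconds per channel, with the right
    channel shifted by delta_t / 2 so that the two channels alternate.
    """
    schedule = {"left": [], "right": []}
    for channel, offset in (("left", 0.0), ("right", delta_t / 2)):
        start = offset
        while start + burst_t <= audio_t:  # the burst must fit entirely
            schedule[channel].append(start)
            start += delta_t
    return schedule

# Values taken from this page: audio length, DELTA_T and duration of m_psk_out
schedule = burst_schedule(audio_t=56.54, delta_t=10.0, burst_t=1.705)
print(len(schedule["left"]), len(schedule["right"]))
```

Under this assumption the left channel gets bursts at 0, 10, ..., 50 s and the right channel at 5, 15, ..., 45 s; a burst starting at 55 s would end after the audio does.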

Information about the output audio with the embedded URL is shown in the following table:

RMS (mean)	0.143
MAX (tot)	0.869

The output stereo audio with the URL embedded has the spectrogram shown below. Here you can see the result of the bandstop filtering (blue/violet parts around FC Hz) and the insertion of the PSK messages (yellow parts around FC Hz).

Spectrogram of the output audio with the URL embedded.

Lastly, if you play the resulting audio signal, you should hear no difference from the original audio above. Again, proper speakers must be used and the volume must not be at the maximum, otherwise distortion may be audible. You can also detect the presence of the inserted messages with any audio spectrum analyser.

Audio signal obtained by embedding m_psk_out in the original audio. WARNING: although the inserted audio messages are inaudible, they have a high amplitude compared to the audible parts. Digital clipping does not occur, but do not use high volumes: the inserted messages could become audible and the speakers could be damaged by the saturation of their amplifiers.

From AUDIO back to URL

The system has been tested in the six situations listed above. Each time, REC_T seconds were taken from a random position in the recorded audio file, to simulate a microphone recording.

Scenario [1]

Here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an outdoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [1]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [1]: BPSK signal detection

After that, the RMS of the detected signal is 0.0027 and is normalized to NORM_RMS_EST = 0.077.

Then the preamble synchronization is performed, as shown in the following figures. The cross correlation is very similar to the expected one.

SCENARIO [1]: Preamble notch filtering

SCENARIO [1]: Preamble synchronization
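
Preamble synchronization by cross-correlation can be sketched as follows in Python (this is not the project's Matlab code). The real preamble of this system is not reproduced here: a 13-chip Barker sequence is used as a stand-in, and the received samples are made up. The absolute value of the correlation is used because the received BPSK signal may arrive with inverted phase.

```python
def cross_correlate(signal, template):
    """Sliding dot product of the template over the signal (full overlaps only)."""
    n = len(template)
    return [sum(signal[lag + k] * template[k] for k in range(n))
            for lag in range(len(signal) - n + 1)]

def find_preamble(signal, template):
    """Return the lag with the largest correlation magnitude, and the correlation."""
    corr = cross_correlate(signal, template)
    lag = max(range(len(corr)), key=lambda i: abs(corr[i]))
    return lag, corr

# Stand-in preamble (Barker-13), NOT the preamble used by the system
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

# Made-up received samples: silence, an attenuated and inverted preamble, some data
received = [0.0] * 7 + [-0.3 * c for c in BARKER_13] + [0.3, -0.3, -0.3, 0.3, 0.3]

lag, corr = find_preamble(received, BARKER_13)
print(lag)  # 7
```

In an indoor recording, reflections add delayed copies of the preamble, which show up as secondary peaks in the correlation and can make the choice of the main peak less reliable.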

At this point, the signal is demodulated (carrier recovery, carrier multiplication, matched lowpass filter and sampling) using each of the three methods IC, RC and NC. The results are shown in the following figure (only a limited time interval is shown).

SCENARIO [1]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the "(IC) after sampling" plot the effect of a carrier phase mismatch can be seen. There is some noise, but the symbols are always clearly distinguishable.

SCENARIO [1]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

The following figure shows the in-phase and quadrature components of these symbols over time. They are always distinguishable (in the "(IC) after sampling" plot there is somewhat more noise).

SCENARIO [1]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are always distinguishable. In the "(IC) after sampling" plot the effect of a small constant carrier frequency mismatch can also be seen (maybe due to a slight movement of the smartphone during the recording).

SCENARIO [1]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

As the CRC check detects no errors, RLE and Varicode decoding are performed: the text is recovered and is equal to the original:

text_rec = https://example.com/123abcd/index.html
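
The CRC check that gates the final decoding can be sketched as follows in Python (not the project's Matlab code). The length table shows that the data CRC adds 16 bits (from 194 to 210), but the polynomial is not given on this page: 0x1021 with a zero initial value is an assumption made only for this example, and the data bits are made up.

```python
def crc_bits(bits, poly=0x1021, width=16):
    """Bitwise CRC over a list of bits (MSB first, zero initial value).

    The polynomial is a hypothetical choice for this sketch.
    """
    mask = (1 << width) - 1
    reg = 0
    for b in bits:
        top = (reg >> (width - 1)) & 1
        reg = (reg << 1) & mask
        if top ^ b:
            reg ^= poly
    return [(reg >> i) & 1 for i in range(width - 1, -1, -1)]

def crc_ok(frame):
    """A frame with its CRC appended leaves an all-zero remainder."""
    return not any(crc_bits(frame))

# Made-up 194-bit payload (same length as after zero-padding in the table)
data = [(i * i + i // 3) % 2 for i in range(194)]
frame = data + crc_bits(data)

corrupted = list(frame)
corrupted[57] ^= 1  # a single bit error left over by the previous decoders

print(len(frame), crc_ok(frame), crc_ok(corrupted))  # 210 True False
```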

Comparison with expected data

Lastly, to test the last stages, the errors at each of them can be computed (original expected data are saved in another file when embedding the URL). In the following table the number of errors is reported for each of the last steps:

		IC	RC	NC
HEADER	After differential decoding	0 (0%)	0 (0%)	0 (0%)
HEADER	After repetitions decoding	0 (0%)	0 (0%)	0 (0%)
PAYLOAD	After differential decoding	0 (0%)	0 (0%)	0 (0%)
	After convolutional decoding	0 (0%)	0 (0%)	0 (0%)
	After Reed-Solomon decoding	0 (0%)	0 (0%)	0 (0%)
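
Each entry of this kind of table is a bit-by-bit comparison between the expected and the received array at a given stage. A minimal Python sketch (not the project's Matlab code; the bit arrays below are made up) that produces an entry in the same "count (percentage)" form:

```python
def count_errors(expected, received):
    """Return the number of differing bits and the error rate in percent."""
    if len(expected) != len(received):
        raise ValueError("bit arrays must have the same length")
    errors = sum(e != r for e, r in zip(expected, received))
    return errors, 100.0 * errors / len(expected)

# Made-up data: 632 payload bits, two of which are flipped
expected = [i % 2 for i in range(632)]
received = list(expected)
received[10] ^= 1
received[400] ^= 1

errors, percent = count_errors(expected, received)
print(f"{errors} ({percent:.2f}%)")  # 2 (0.32%)
```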

Scenario [2]

Here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an outdoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [2]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [2]: BPSK signal detection

After that, the RMS of the detected signal is 0.0019 and is normalized to NORM_RMS_EST = 0.077.

Then the preamble synchronization is performed, as shown in the following figures. Here, the cross correlation is similar to the expected one.

SCENARIO [2]: Preamble notch filtering

SCENARIO [2]: Preamble synchronization

SCENARIO [2]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the "(IC) after sampling" plot the effect of Doppler due to movement can be seen, but there is not much noise. In the other cases there is not much noise either, and the symbols are always clearly distinguishable.

SCENARIO [2]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

The following figure shows the in-phase and quadrature components of these symbols over time. They are always distinguishable. In the "(IC) after sampling" plot the consequence of the Doppler effect can also be seen.

SCENARIO [2]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are always distinguishable. In the "(IC) after sampling" plot the effect of a time-varying carrier frequency mismatch due to Doppler can also be seen.

SCENARIO [2]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

As the CRC check detects no errors, RLE and Varicode decoding are performed: the text is recovered and is equal to the original:

text_rec = https://example.com/123abcd/index.html

Comparison with expected data

		IC	RC	NC
HEADER	After differential decoding	0 (0%)	0 (0%)	0 (0%)
HEADER	After repetitions decoding	0 (0%)	0 (0%)	0 (0%)
PAYLOAD	After differential decoding	0 (0%)	0 (0%)	0 (0%)
	After convolutional decoding	0 (0%)	0 (0%)	0 (0%)
	After Reed-Solomon decoding	0 (0%)	0 (0%)	0 (0%)

Scenario [3]

Here the capture was made not close to the speakers (about 3 meters away), keeping the device still, in an outdoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [3]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [3]: BPSK signal detection

After that, the RMS of the detected signal is 0.0018 and is normalized to NORM_RMS_EST = 0.077.

Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is very similar to the expected one.

SCENARIO [3]: Preamble notch filtering

SCENARIO [3]: Preamble synchronization

SCENARIO [3]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the "(IC) after sampling" plot the effect of a carrier frequency mismatch can be seen. In the other cases there is some noise, but the symbols are always distinguishable.

SCENARIO [3]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

The following figure shows the in-phase and quadrature components of these symbols over time. Although noisy, they are mostly distinguishable. In the "(IC) after sampling" plot the effect of a constant carrier frequency mismatch can also be seen (maybe due to a slight movement of the smartphone during the recording).

SCENARIO [3]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are always distinguishable. In the "(IC) after sampling" plot the effect of a constant carrier frequency mismatch can also be seen (maybe due to a slight movement of the smartphone during the recording).

SCENARIO [3]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

As the CRC check detects no errors, RLE and Varicode decoding are performed: the text is recovered and is equal to the original:

text_rec = https://example.com/123abcd/index.html

Comparison with expected data

		IC	RC
HEADER	After differential decoding	0 (0%)	0 (0%)
HEADER	After repetitions decoding	0 (0%)	0 (0%)
PAYLOAD	After differential decoding	1 (0.16%)	1 (0.16%)
	After convolutional decoding	0 (0%)	0 (0%)
	After Reed-Solomon decoding	0 (0%)	0 (0%)

Scenario [4]

Here the capture was made close to the speakers (about 1 meter away), keeping the device still, in an indoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [4]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [4]: BPSK signal detection

After that, the RMS of the detected signal is 0.0023 and is normalized to NORM_RMS_EST = 0.077.

Then the preamble synchronization is performed, as shown in the following figures. Here the cross correlation is similar to the expected one, but there are some additional peaks due to wave reflections.

SCENARIO [4]: Preamble notch filtering

SCENARIO [4]: Preamble synchronization

SCENARIO [4]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the "(IC) after sampling" plot the effect of a carrier frequency mismatch can be seen. In the other cases there is some noise, but the symbols are mostly distinguishable.

SCENARIO [4]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

The following figure shows the in-phase and quadrature components of these symbols over time. Although noisy, they are still somewhat distinguishable. In the "(IC) after sampling" plot the effect of a constant carrier frequency mismatch can also be seen (maybe due to a slight movement of the smartphone during the recording).

SCENARIO [4]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are always distinguishable. In the "(IC) after sampling" plot the effect of a constant carrier frequency mismatch can also be seen (maybe due to a slight movement of the smartphone during the recording).

SCENARIO [4]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

As the CRC check detects no errors, RLE and Varicode decoding are performed: the text is recovered and is equal to the original:

text_rec = https://example.com/123abcd/index.html

Comparison with expected data

		IC	RC	NC
HEADER	After differential decoding	2 (5%)	2 (5%)	2 (5%)
HEADER	After repetitions decoding	0 (0%)	0 (0%)	0 (0%)
PAYLOAD	After differential decoding	2 (0.32%)	2 (0.32%)	2 (0.32%)
	After convolutional decoding	0 (0%)	0 (0%)	0 (0%)
	After Reed-Solomon decoding	0 (0%)	0 (0%)	0 (0%)
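
In this scenario the header has 2 wrong bits out of 40, and they disappear after repetition decoding. The length table shows that the 8 header bits become 40, that is 5 copies. The Python sketch below (not the project's Matlab code) shows how a majority vote can correct such errors. The arrangement of the copies (the whole header repeated back to back), the header content and the error positions are assumptions made for this example.

```python
def repeat_encode(header, n_rep=5):
    """Repeat the whole header n_rep times, back to back (assumed layout)."""
    return header * n_rep

def repeat_decode(bits, n_rep=5):
    """Majority vote over the n_rep copies of each header bit."""
    n = len(bits) // n_rep
    return [1 if 2 * sum(bits[r * n + i] for r in range(n_rep)) > n_rep else 0
            for i in range(n)]

# Hypothetical header: 4 bits for the number of blocks plus 4 made-up CRC bits
header = [0, 0, 1, 0, 1, 1, 0, 1]
coded = repeat_encode(header)

# Two made-up bit errors, as in the "2 (5%)" table entry
received = list(coded)
received[3] ^= 1
received[17] ^= 1

decoded = repeat_decode(received)
print(len(coded), decoded == header)  # 40 True
```

With 5 copies, a header bit is decoded correctly as long as at most 2 of its copies are wrong.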

Scenario [5]

Here the capture was made close to the speakers (about 1 meter away), moving the device slightly (to accentuate the Doppler effect), in an indoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [5]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [5]: BPSK signal detection

After that, the RMS of the detected signal is 0.004 and is normalized to NORM_RMS_EST = 0.077.

SCENARIO [5]: Preamble notch filtering

SCENARIO [5]: Preamble synchronization

SCENARIO [5]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

After removing the extra samples at the beginning, the symbols are differentially decoded. The following figure shows the constellation plot for each of the three methods, before and after this step. In the "(IC) after sampling" plot the effect of Doppler due to movement can be seen, together with noise. In the other cases there is some noise too, but the symbols are mostly distinguishable.

SCENARIO [5]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

SCENARIO [5]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are always distinguishable. In the "(IC) after sampling" plot the effect of a time-varying carrier frequency mismatch due to Doppler can also be seen.

SCENARIO [5]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

As the CRC check detects no errors, RLE and Varicode decoding are performed: the text is recovered and is equal to the original:

text_rec = https://example.com/123abcd/index.html

Comparison with expected data

		IC	RC	NC
HEADER	After differential decoding	0 (0%)	0 (0%)	0 (0%)
HEADER	After repetitions decoding	0 (0%)	0 (0%)	0 (0%)
PAYLOAD	After differential decoding	6 (0.95%)	6 (0.95%)	5 (0.79%)
	After convolutional decoding	0 (0%)	0 (0%)	0 (0%)
	After Reed-Solomon decoding	0 (0%)	0 (0%)	0 (0%)

Scenario [6]

Here the capture was made not close to the speakers (about 3 meters away), keeping the device still, in an indoor environment.

The spectrogram of the recorded audio and the result of the bandpass filtering are shown in the following figure.

SCENARIO [6]: Spectrogram of recorded audio and result of bandpass filtering

Then, the BPSK signal is detected as shown in the following figure.

SCENARIO [6]: BPSK signal detection

After that, the RMS of the detected signal is 0.019 and is normalized to NORM_RMS_EST = 0.077.

SCENARIO [6]: Preamble notch filtering

SCENARIO [6]: Preamble synchronization

SCENARIO [6]: Demodulation results over a limited time interval for each of the three methods IC, RC and NC. Both in-phase and quadrature components are shown. Blue lines are the results of carrier multiplication, red lines are the matched-filtered signals and yellow dots are the sampled data

SCENARIO [6]: constellation of sampled symbols over time, both after sampling and after differential decoding. Red dots are the expected symbols

The following figure shows the in-phase and quadrature components of these symbols over time. They are very noisy and never distinguishable.

SCENARIO [6]: in-phase and quadrature components of sampled symbols over time, both after sampling and after differential decoding

Also, as this is a phase modulation, the following figure shows the phase of these symbols over time: the two phases are almost never distinguishable (except when there is a run of the same symbol).

SCENARIO [6]: phase of sampled symbols over time, both after sampling and after differential decoding

After that, the data are hard-decoded into bits. Then the header is decoded to get the length of the payload, and the payload is decoded from the convolutional and Reed-Solomon codes.

In this case CRC check fails and the URL can't be recovered.

Comparison with expected data

		IC	RC	NC
HEADER	After differential decoding	20 (50%)	22 (55%)	21 (52.5%)
HEADER	After repetitions decoding	5 (62.5%)	5 (62.5%)	4 (50%)
PAYLOAD	After differential decoding	218 (34.49%)	210 (33.23%)	185 (29.27%)
	After convolutional decoding	125 (40.32%)	118 (38.06%)	107 (34.52%)
	After Reed-Solomon decoding	70 (33.33%)	71 (33.81%)	59 (28.10%)

Also, the following figures show the distribution of errors over time for each method and stage (0 means that the bit is correct, +1 that the bit is 0 instead of 1, -1 that the bit is 1 instead of 0).

SCENARIO [6]: errors over time at each step with method IC

SCENARIO [6]: errors over time at each step with method RC

SCENARIO [6]: errors over time at each step with method NC
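
The sign convention used in these error plots corresponds to subtracting the received bits from the expected ones. A small Python sketch with made-up bit arrays (not the project's Matlab code; the burst-length helper is an addition for illustration only):

```python
def signed_errors(expected, received):
    """0: correct, +1: received 0 instead of 1, -1: received 1 instead of 0."""
    return [e - r for e, r in zip(expected, received)]

def longest_burst(errors):
    """Length of the longest run of consecutive wrong bits."""
    longest = current = 0
    for e in errors:
        current = current + 1 if e != 0 else 0
        longest = max(longest, current)
    return longest

expected = [1, 0, 1, 1, 0, 0, 1, 0]
received = [1, 1, 0, 0, 0, 0, 1, 0]

errors = signed_errors(expected, received)
print(errors)                 # [0, -1, 1, 1, 0, 0, 0, 0]
print(longest_burst(errors))  # 3
```

Looking at run lengths in this way is one possible check of whether errors arrive in bursts, which is the case where Reed-Solomon codes are expected to help.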