Perceptual Data Embedding in Audio and Speech Signals
Thesis posted on 23.05.2021, 17:01 by Libo Zhang
Perceptual embedding is a technique for embedding extra information into multimedia signals without perceptible loss of fidelity, and it is the core of many applications, including watermarking and data hiding. Perceptual embedding can be viewed as a communication problem in which the embedded information is transmitted over a channel formed by the host signal. This view divides current embedding techniques into two categories: host-suppressing schemes, such as the quantization-based Quantization Index Modulation (QIM) and Scalar Costa Scheme (SCS), and non-host-suppressing schemes, such as the conventional Spread Spectrum (SS) technique. The former class has significant advantages over the latter in robustness and data rate, because suppressing the host significantly reduces the noise level seen by the detector. In this research, the conventional SS embedding technique is modified so that it suppresses most of the host interference. Both theoretical analysis and simulations show that the modification significantly improves the performance of the conventional scheme and, further, outperforms QIM and SCS in the watermarking case, where attacks can be expected to be very strong. To further increase robustness and embedding rate, measures such as the frequency masking effects of the Human Auditory System and Forward Error Correction schemes such as Turbo codes are employed. The second part of this research explores the possibility of high-capacity embedding in telephony speech signals. Another modification of the conventional SS scheme is proposed to improve the embedding rate under weak attacks, which are expected in the data-embedding case.
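The contrast between the two categories can be illustrated with a minimal sketch of the textbook baselines: conventional additive SS with a correlation detector, and scalar QIM with a nearest-lattice detector. It is not the modified scheme proposed in the thesis, which the abstract does not specify. All function names, the toy host samples, the spreading sequence `pn`, the strength `alpha`, and the step `delta` are illustrative assumptions.

```python
# Illustrative baselines only; names and values are assumptions, not from the thesis.

def ss_embed(host, bit, pn, alpha):
    """Conventional additive SS: s = x + alpha * b * u, with b = -1 for bit 0, +1 for bit 1."""
    b = 1 if bit else -1
    return [x + alpha * b * u for x, u in zip(host, pn)]

def ss_detect(received, pn):
    """Correlation detector: the sign of <r, u> decides the bit."""
    corr = sum(r * u for r, u in zip(received, pn))
    return 1 if corr >= 0 else 0

def qim_embed(x, bit, delta):
    """Scalar QIM: quantize x onto the lattice for this bit (bit 1 is shifted by delta/2)."""
    offset = 0.0 if bit == 0 else delta / 2.0
    return delta * round((x - offset) / delta) + offset

def qim_detect(r, delta):
    """Choose the bit whose lattice has the point nearest to r."""
    d0 = abs(r - qim_embed(r, 0, delta))
    d1 = abs(r - qim_embed(r, 1, delta))
    return 0 if d0 <= d1 else 1

# Toy host block whose projection onto pn is large: <x, u> = 8.
host = [3.0, -2.0, 1.0, 2.0]
pn = [1, -1, 1, 1]

# SS: the host projection acts as noise at the detector even without any attack.
weak = ss_detect(ss_embed(host, 0, pn, alpha=1.0), pn)    # correlation = 8 - 4
strong = ss_detect(ss_embed(host, 0, pn, alpha=3.0), pn)  # correlation = 8 - 12

# QIM: the embedded value depends only on the lattice, so the host does not interfere.
qim_bits = [qim_detect(qim_embed(x, 0, delta=1.0), delta=1.0) for x in host]
```

In this toy case the SS detector misreads bit 0 at `alpha=1.0` purely because of the host, and recovers it only when the embedding strength is raised to `alpha=3.0`, whereas QIM decodes every sample correctly with a distortion of at most `delta / 2`. This is the host-interference penalty that host-suppressing schemes avoid.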