Perceptual Hashing Algorithm For Speech Content Identification Based On Spectrum Entropy In Compressed Domain

This paper proposes a new perceptual hashing algorithm for speech content identification in the compressed domain, based on MDCT (Modified Discrete Cosine Transform) spectrum entropy. Its primary aim is to address the high computational complexity and poor real-time performance that arise when traditional identification methods are applied to compressed speech. The process begins by extracting the MDCT coefficients, which are intermediate decoding results of compressed speech in MP3 format. To reduce computational complexity, these coefficients are divided into sub-bands, and the MDCT spectrum energy of each sub-band is calculated. The sub-band energies are then mapped to a function analogous to the probability mass function of information entropy theory. This function serves as the perceptual feature from which binary hash values are extracted. Experimental results show that the proposed algorithm retains good robustness to content-preserving operations while remaining efficient. Because only partial decoding is required, its real-time performance can meet the requirements of applications in real-time communication terminals.
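The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the number of sub-bands (`n_subbands=32`), and the binarization rule (a bit is 1 when the entropy rises from one frame to the next) are assumptions made here for illustration, since the abstract does not specify them. The sketch also assumes the MDCT coefficients have already been obtained from a partial MP3 decode and are passed in as a 2-D array with one row per frame.

```python
import numpy as np

def spectrum_entropy_hash(mdct, n_subbands=32):
    """Sketch: per-frame spectrum entropy and a binary hash.

    mdct: array of shape (n_frames, n_coeffs) holding MDCT coefficients
          (assumed to come from a partially decoded MP3 stream).
    n_subbands: hypothetical number of sub-bands per frame.
    """
    mdct = np.asarray(mdct, dtype=float)
    n_frames, n_coeffs = mdct.shape

    # Split each frame into equal-width sub-bands (drop any remainder).
    usable = (n_coeffs // n_subbands) * n_subbands
    bands = mdct[:, :usable].reshape(n_frames, n_subbands, -1)

    # Sub-band energy: sum of squared coefficients in each band.
    energy = np.sum(bands ** 2, axis=2)

    # Normalize energies so each frame sums to 1, like a probability
    # mass function. Silent frames fall back to a uniform distribution.
    total = energy.sum(axis=1, keepdims=True)
    safe_total = np.where(total > 0, total, 1.0)
    p = np.where(total > 0, energy / safe_total, 1.0 / n_subbands)

    # Shannon entropy per frame, treating 0 * log(0) as 0.
    logp = np.log2(p, where=p > 0, out=np.zeros_like(p))
    entropy = -np.sum(p * logp, axis=1)

    # Assumed binarization rule: bit = 1 if entropy increases
    # relative to the previous frame, else 0.
    bits = (np.diff(entropy) > 0).astype(np.uint8)
    return entropy, bits

def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(np.asarray(h1) != np.asarray(h2)))
```

Two clips would then be compared by computing `bit_error_rate` on their hashes and checking it against a decision threshold; the threshold value is application-dependent and is not given in the abstract.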

Language: English

Publication frequency: once a year
Journal subjects: Engineering, Introductions and Overviews, Engineering, other



Zhang Qiu-yu

Liu Yang-wei

Huang Yi-bo

Xing Peng-fei

Yang Zhong-ping

Published online: 10 Mar 2014

Pages: 283 - 300

Received: 05 Nov 2013

Accepted: 08 Feb 2014

DOI: https://doi.org/10.21307/ijssis-2017-656

Keywords: Perceptual speech hashing algorithm, Spectrum entropy, Modified discrete cosine transform, Compressed domain

© 2014 Zhang Qiu-yu et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
