We propose a new technique for modifying the time-scale of speech using Independent Subspace Analysis (ISA). To carry out ISA, the single-channel mixture signal is first converted to a time-frequency representation such as a spectrogram. Here, the spectrogram is generated by applying the Hartley or wavelet transform to overlapped frames of speech. We reduce the dimensionality of the autocorrelated original spectrogram using singular value decomposition, and then apply independent component analysis (ICA), using the JadeICA algorithm, to obtain an unmixing matrix. The overall spectrogram is assumed to result from the superposition of a number of unknown, statistically independent spectrograms. Using the unmixing matrix, independent sources such as temporal amplitude envelopes and frequency weights can be extracted from the spectrogram. Time-scaling of speech is carried out by resampling the independent temporal amplitude envelopes. Multiplying the independent frequency weights by the time-scaled temporal amplitude envelopes yields time-scaled independent spectrograms. These independent spectrograms are summed, the inverse Hartley or wavelet transform of the sum spectrogram is taken, and the reconstructed time-domain frames are overlap-added to obtain the time-scaled speech. The quality of the time-scaled speech has been analyzed using Modified Bark Spectral Distortion (MBSD). From the MBSD score, one can infer that the time-scaled signal is less distorted.
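The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it covers only the Hartley-transform variant, it applies the SVD directly to the Hartley spectrogram rather than to an autocorrelated one, and it does not compute an ICA unmixing matrix (the optional `unmixing` argument is a placeholder for a matrix obtained elsewhere, for example from a JADE routine). All function names, the Hann window, the linear-interpolation resampling, and the default frame length, hop, and rank are assumptions made for illustration.

```python
import numpy as np

def dht(frames):
    """Discrete Hartley transform along the last axis: H = Re(F) - Im(F)."""
    spectrum = np.fft.fft(frames, axis=-1)
    return spectrum.real - spectrum.imag

def hartley_spectrogram(signal, frame_len=256, hop=128):
    """Hartley transform of overlapped, Hann-windowed frames; shape (freq, time)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return dht(frames).T

def decompose(spec, rank, unmixing=None):
    """Rank-reduced SVD, optionally rotated by a supplied ICA unmixing matrix."""
    u, s, vt = np.linalg.svd(spec, full_matrices=False)
    weights = u[:, :rank] * s[:rank]      # frequency weights, shape (freq, rank)
    envelopes = vt[:rank]                 # temporal envelopes, shape (rank, time)
    if unmixing is not None:              # assumed to come from ICA; not computed here
        envelopes = unmixing @ envelopes
        weights = weights @ np.linalg.inv(unmixing)
    return weights, envelopes

def time_scale_envelopes(envelopes, factor):
    """Resample each temporal envelope to about factor * n_frames frames."""
    n_old = envelopes.shape[1]
    n_new = max(1, int(round(factor * n_old)))
    old_pos = np.arange(n_old)
    new_pos = np.linspace(0, n_old - 1, n_new)
    return np.stack([np.interp(new_pos, old_pos, env) for env in envelopes])

def overlap_add(spec, frame_len=256, hop=128):
    """Inverse Hartley transform of each frame, followed by overlap-add."""
    frames = dht(spec.T) / frame_len      # the DHT is its own inverse up to 1/N
    out = np.zeros(hop * (frames.shape[0] - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out

def time_scale(signal, factor, rank=20, frame_len=256, hop=128):
    """Decompose, resample the envelopes, sum the components, and resynthesize."""
    spec = hartley_spectrogram(signal, frame_len, hop)
    weights, envelopes = decompose(spec, rank)
    scaled = time_scale_envelopes(envelopes, factor)
    # weights @ scaled sums the per-component (outer-product) spectrograms
    return overlap_add(weights @ scaled, frame_len, hop)
```

Calling `time_scale(x, 2.0)` on a one-dimensional array `x` returns a signal roughly twice as long, while a factor below 1 shortens it. The sketch is meant to show how the data flows through the steps; it makes no claim about matching the quality reported for the original method.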
Added on August 5, 2014
Product Type : Research Paper
License Type : Freeware
Author : R. Muralishankar, Lakshmish N. Kaushik, A. G. Ramakrishnan