International Journal of Applied Science and Engineering
Published by Chaoyang University of Technology

Narzillo Mamatov*, Nilufar Niyozmatova, Abdurashid Samijonov

Tashkent University of Information Technologies named after Al-Kharezmi, Tashkent, Uzbekistan




ABSTRACT


One of the most important tasks of modern science is the development of software tools that let people communicate with devices (for example, a computer) in natural language, with speech input and output carried out in the most user-friendly way. Creating such tools requires solving speech recognition problems. Many experimental studies indicate that the quality of speech recognition depends on the results of preliminary signal processing. Improving recognition quality therefore requires new, efficient, high-speed methods and algorithms for signal preprocessing.
This article proposes a new approach and algorithm for extracting features from speech signals. The features obtained by the proposed algorithm are then used to solve the identification problem. The article also describes the software module for each stage of speech signal preprocessing. The developed software is a voice-based identification tool.
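
The abstract does not list the individual preprocessing stages, so the following is only an illustrative sketch of stages commonly found in speech front ends (pre-emphasis, framing, and Hamming windowing), not the authors' algorithm. All function names and parameter values are assumptions chosen for the example; the defaults of 400 samples per frame and a hop of 160 samples correspond to 25 ms frames with a 10 ms step at an assumed 16 kHz sampling rate.

```python
import math


def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[n] - alpha * samples[n - 1] for n in range(1, len(samples))
    ]


def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def hamming(frame):
    """Multiply a frame by a Hamming window of the same length."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def preprocess(samples, frame_len=400, hop=160, alpha=0.97):
    """Hypothetical pipeline: pre-emphasis, then framing, then windowing."""
    emphasized = pre_emphasis(samples, alpha)
    return [hamming(f) for f in frame_signal(emphasized, frame_len, hop)]
```

Each windowed frame produced this way could then be passed to a feature extractor such as MFCC, PLP, or LPCC; which stages the described software actually implements is detailed in the full article rather than here.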


Keywords: Algorithm, Signal, Speech signal, Filter, MFCC, PLP, LPCC.





ARTICLE INFORMATION


Received: 2020-07-28

Accepted: 2020-11-03
Available Online: 2021-03-01


Cite this article:

Mamatov, N., Niyozmatova, N., Samijonov, A. 2021. Software for preprocessing voice signals. International Journal of Applied Science and Engineering, 18, 2020163. https://doi.org/10.6703/IJASE.202103_18(1).006

  Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.