Software for preprocessing voice signals

Narzillo Mamatov; Nilufar Niyozmatova; Abdurashid Samijonov

doi:10.6703/IJASE.202103_18(1).006

Narzillo Mamatov*, Nilufar Niyozmatova, Abdurashid Samijonov

Tashkent University of Information Technologies named after Al-Kharezmi, Tashkent, Uzbekistan

ABSTRACT

A central task of modern science is the development of software that lets people communicate with devices (for example, a computer) in natural language, with speech input and output handled in the most user-friendly way. Building such tools requires solving speech recognition problems. Many experimental studies indicate that the quality of speech recognition depends on the results of preliminary signal processing, so improving it requires new, efficient, and fast signal preprocessing methods and algorithms.
This article proposes a new approach and algorithm for extracting features from speech signals. The features obtained by the proposed algorithm are then used to solve the identification problem. The article also describes the software module for each stage of speech signal preprocessing. The developed software is a voice-based identification tool.

Keywords: Algorithm, Signal, Speech signal, Filter, MFCC, PLP, LPCC.
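
The abstract does not list the individual preprocessing stages, so the following is only a minimal sketch of steps that commonly precede cepstral feature extraction such as MFCC: pre-emphasis, framing, and Hamming windowing. It is not the authors' algorithm. The function names, the pre-emphasis coefficient of 0.97, and the frame and hop lengths (400 and 160 samples, roughly 25 ms and 10 ms at an assumed 16 kHz sampling rate) are illustrative assumptions.

```python
import math


def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is kept as is."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def frame_signal(samples, frame_len, hop_len):
    """Split a signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def hamming_window(length):
    """Return the symmetric Hamming window of the given length."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def preprocess(samples, frame_len=400, hop_len=160, alpha=0.97):
    """Pre-emphasize, frame, and window a signal (hypothetical helper)."""
    emphasized = pre_emphasis(samples, alpha)
    window = hamming_window(frame_len)
    return [
        [s * w for s, w in zip(frame, window)]
        for frame in frame_signal(emphasized, frame_len, hop_len)
    ]
```

Calling `preprocess(samples)` on a list of floating-point samples returns a list of windowed frames, each of which could then be passed to a feature extractor of the kind named in the keywords (MFCC, PLP, LPCC).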

ARTICLE INFORMATION

Received: 2020-07-28

Accepted: 2020-11-03
Available Online: 2021-03-01

Cite this article:

Mamatov, N., Niyozmatova, N., Samijonov, A. 2021. Software for preprocessing voice signals. International Journal of Applied Science and Engineering, 18, 2020163. https://doi.org/10.6703/IJASE.202103_18(1).006

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.
