Abstract:
This paper considers the
efficiency of neural networks
for human voice recognition.
The objects of the study are artificial neural networks used for
human voice recognition. The study examines their
ability to recognize a human voice regardless of language when
trained on a small number of speakers in noisy conditions. The
task addressed is to improve
the accuracy of speech activity detection, which plays a significant role in the
performance of automatic speech
recognition systems, especially
at low signal-to-noise ratios.
The findings showed that
the accuracy of human voice
recognition in languages of
different phonetic proximity
could vary greatly. The recurrent neural network
(RNN) achieved a voice recognition accuracy of 95 %,
exceeding the 94 % reached by the
convolutional neural network
(CNN). Special features of the
results are the adaptation of
neural networks to multilingual
features, which made it possible to increase their efficiency.
An important conclusion is that training neural networks on data
covering different languages and speaker types significantly improves
recognition accuracy. The results are an important
contribution to the development
of speech recognition technologies and have the potential for
application in various fields
where high accuracy in human
voice recognition is required.