Using of Open-Source Technologies for the Design and Development of a Speech Processing System Based on Stemming Methods
Abstract
This article discusses the idea of developing an intelligent and customizable automated system for real-time text and voice dialogs with the user. This system can be used for almost any subject area, for example, to create an automated robot - a call center operator or smart chat bots, assistants, and so on. This article presents the developed flexible architecture of the proposed system. The system has many independent submodules. These modules work as interacting microservices and use several speech recognition schemes, including a decision support submodule, third-party speech recognition systems and a post-processing subsystem. In this paper, the post-processing module of the recognized text is presented in detail on the example of Russian and English dictionary models. The proposed submodule also uses several processing steps, including the use of various stemming methods, the use of word stop-lists or other lexical structures, the use of stochastic keyword ranking using a weight table, etc.
Origin | Files produced by the author(s) |
---|