RantoVox is a Telegram bot written in Python 3 for running Speech-To-Text (STT) and Text-To-Speech (TTS) queries. Supported languages (queries and interface): Russian, English.
Module for working with Telegram API: Aiogram.
Software for converting audio files into different formats: FFmpeg.
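As an illustration of the conversion step, here is a minimal sketch of calling FFmpeg from Python. It assumes the incoming audio is an OGG voice message and that the recognizer expects 16 kHz mono WAV; the function names, file names, and sample rate are assumptions for this example and are not taken from the bot's source.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path, sample_rate: int = 16000) -> list[str]:
    """Build an FFmpeg command that converts `src` to mono WAV at `sample_rate` Hz."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", str(src),            # input file (assumed: an OGG voice message)
        "-ac", "1",                # downmix to a single audio channel
        "-ar", str(sample_rate),   # resample to the requested rate
        "-f", "wav",               # force WAV as the output container
        str(dst),
    ]


def convert_to_wav(src: Path, dst: Path) -> None:
    """Run FFmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
```

Building the command as a list (instead of a shell string) avoids quoting problems with unusual file names.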
STT and TTS queries are performed using the following libraries:
Bot supports two voices (male and female), whose names are set in the configuration file.
Bot has an optional feature called extra_text_processing, which applies additional processing to the text received from Vosk so that the result reads more naturally and is written more correctly. The materials required for this feature must be stored in src/etp.
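To give a feel for what such post-processing might look like, here is a minimal sketch. Vosk returns lowercase text without punctuation, so the example normalizes whitespace, capitalizes the first letter, and adds a final period. The function name `polish_text` and the specific rules are hypothetical; the real extra_text_processing may work quite differently.

```python
import re


def polish_text(raw: str) -> str:
    """Hypothetical clean-up of raw recognizer output (not the bot's actual ETP)."""
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", raw).strip()
    if not text:
        return text
    # Capitalize only the first character, leaving the rest untouched.
    text = text[0].upper() + text[1:]
    # Close the sentence if it has no terminal punctuation yet.
    if text[-1] not in ".!?":
        text += "."
    return text
```

For example, `polish_text("  hello   world ")` returns `"Hello world."`.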
Note: Created and tested on Windows platform, Python 3.11.4
The following steps are required for RantoVox to work correctly:
- Clone the repository (download the source code)
- Create a virtual environment using `python -m venv venv` and activate it
- Install dependencies using pip with requirements.txt
- Download the latest Vosk Russian and English language models (the small models are preferable) and place them in src/lang (you can use `make download` to download and place them automatically; this requires curl and tar)
- Create your own .env file in the root folder with the variables described in the Environment file section.
git clone https://github.com/Ggorets0dev/rantovox-telegram-bot.git
cd rantovox-telegram-bot
pip install -r requirements.txt
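Before the first launch it can help to verify the setup steps programmatically. The sketch below is a hypothetical pre-flight check and is not part of the repository: it looks for FFmpeg on PATH, for the .env file in the project root, and for the language model folders under src/lang. The function name and the messages are assumptions.

```python
import shutil
from pathlib import Path


def find_setup_problems(root: Path, model_dirnames: list[str]) -> list[str]:
    """Return human-readable descriptions of missing prerequisites."""
    problems = []
    # FFmpeg must be callable from the command line.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # The environment file is expected in the project root.
    if not (root / ".env").is_file():
        problems.append(".env file is missing in the project root")
    # Each language model must be unpacked into src/lang.
    for name in model_dirnames:
        if not (root / "src" / "lang" / name).is_dir():
            problems.append(f"language model folder missing: src/lang/{name}")
    return problems
```

An empty list means all checked prerequisites were found, for example `find_setup_problems(Path("."), ["vosk-model-small-ru-0.22", "vosk-model-small-en-us-0.15"])`.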
The following commands are available in RantoVox:
- start - Launch a bot for your account
- help - Get an informational summary of the operating principles
- setlocale - Set language of bot's interface
- setvoice - Set voice gender for requests (TTS)
- setlang - Set language for requests (STT)
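The three set* commands each change one per-user preference. A minimal sketch of how such preferences could be stored and validated is shown below; the class, the field names, the value codes ("ru", "en", "male", "female"), and the defaults are all assumptions for illustration and do not describe the bot's actual data model.

```python
from dataclasses import dataclass

# Allowed values per command (hypothetical codes).
ALLOWED = {
    "setlocale": ("ru", "en"),       # interface language
    "setvoice": ("male", "female"),  # TTS voice gender
    "setlang": ("ru", "en"),         # STT language
}

# Which settings field each command changes.
FIELD_FOR_COMMAND = {
    "setlocale": "locale",
    "setvoice": "voice",
    "setlang": "stt_language",
}


@dataclass
class UserSettings:
    locale: str = "en"
    voice: str = "male"
    stt_language: str = "en"


def apply_setting(settings: UserSettings, command: str, value: str) -> None:
    """Validate `value` for `command` and store it in `settings`."""
    if command not in ALLOWED:
        raise ValueError(f"unknown command: {command}")
    if value not in ALLOWED[command]:
        raise ValueError(f"unsupported value for {command}: {value}")
    setattr(settings, FIELD_FOR_COMMAND[command], value)
```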
A .env file with the following variables must be created before running the bot:
Name | Example | Default | Description |
---|---|---|---|
TELEGRAM_TOKEN | 1234567890:ABCDEFGHIJKLMNOPQRSTUVXYZabcdefghi | - | Access token to the created Telegram bot |
MALE_VOICE_NAME | Aleksandr | - | Name of the voice to be used in the male voiceover |
FEMALE_VOICE_NAME | Elena | - | Name of the voice to be used in the female voiceover |
RU_LANG_MODEL_DIRNAME | vosk-model-small-ru-0.22 | - | Name of folder with Russian language model (should be in src/lang) |
ENG_LANG_MODEL_DIRNAME | vosk-model-small-en-us-0.15 | - | Name of folder with English language model (should be in src/lang) |
MAX_REQUEST_INDEX | 100 | 1000 | Upper bound n of the range 0 - n from which indices are assigned to temporarily created files (limits the number of simultaneously served clients) |
ETP_ENABLED | False | True | Whether extra text processing is applied to the raw text produced by the STT conversion |
Note: Default is the value the bot falls back to when the value in the environment file is in the wrong format.
Note: To get a list of available voices, start the bot with all the other variables filled in. It will display the available values (be careful: not all voices support both Russian and English).
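The fallback behaviour described in the Default column can be sketched as follows. This is a hypothetical illustration, not the bot's actual loader: the helper names, the accepted boolean spellings, and the decision to treat negative numbers as invalid are assumptions. The helpers read from any mapping, so they work with `os.environ` once the .env file has been loaded by whatever mechanism the project uses.

```python
import os
from collections.abc import Mapping


def read_int(name: str, default: int, env: Mapping[str, str] = os.environ) -> int:
    """Return a non-negative integer variable, or `default` if missing or malformed."""
    try:
        value = int(env.get(name, ""))
    except ValueError:
        return default
    # Assumption: negative values are treated as a wrong format.
    return value if value >= 0 else default


def read_bool(name: str, default: bool, env: Mapping[str, str] = os.environ) -> bool:
    """Return a boolean variable, or `default` if missing or malformed."""
    raw = env.get(name, "").strip().lower()
    if raw in ("true", "1", "yes"):
        return True
    if raw in ("false", "0", "no"):
        return False
    return default
```

With the defaults from the table, a call would look like `read_int("MAX_REQUEST_INDEX", 1000)` or `read_bool("ETP_ENABLED", True)`.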
Bot deletes all temporary files immediately after a TTS or STT request. All conversion is done on the host using the libraries described above. Only the user's login and ID are written to the logs for each request; the content of the request is not visible to the host.
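One way to guarantee this kind of cleanup is a context manager that removes the file even when processing fails. The sketch below is an assumption about how it could be done, not the bot's actual code: the file naming scheme, the counter, and the function name are hypothetical, and the index simply wraps around within the range 0 - MAX_REQUEST_INDEX.

```python
import itertools
from contextlib import contextmanager
from pathlib import Path

MAX_REQUEST_INDEX = 1000          # upper bound n of the index range 0 - n
_request_counter = itertools.count()


@contextmanager
def request_file(directory: Path, suffix: str):
    """Yield a path for a temporary request file and delete it afterwards."""
    # Wrap the running counter into the range 0..MAX_REQUEST_INDEX.
    index = next(_request_counter) % (MAX_REQUEST_INDEX + 1)
    path = directory / f"request_{index}{suffix}"
    try:
        yield path
    finally:
        # Runs on success and on error; a missing file is not a problem.
        path.unlink(missing_ok=True)
```

A request handler would wrap its work in `with request_file(tmp_dir, ".wav") as path:` so that the file never outlives the request.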