Transcription Backend

The backend transcription model is the system that receives raw audio, processes it, and returns the text version of what was spoken. It handles tasks like chunking audio, running it through a speech-to-text engine, and sending back structured transcription results.

Why It’s Useful

Using a backend model keeps all the heavy processing off the user’s device, giving faster and more reliable results. It also lets you centralize improvements (like better accuracy, noise handling, or new features) without requiring users to update anything. Overall, it makes transcription consistent, scalable, and easy to integrate with commands or game logic.

Setting up Backend Transcription for Online mode

For Mac/Linux

Clone this project.
Navigate to the folder called backend
Ensure you have python version 3.11 installed To check your version, use command python3 --version
Create a virtual environment using the command python3.11 -m venv venv
Enter the virtual environment using the command source venv/bin/activate
Install all the requirements using command pip install -r requirements.txt
Once all requirements have installed successfully, start the server using the command fastapi dev main.py

For Windows

Clone this project.
Navigate to the folder called backend
Ensure you have python version 3.11 installed. To check your version, use command python3 --version
Create a virtual environment using the command python3.11 -m venv venv
Enter the virtual environment using the command venv\Scripts\activate
Install all the requirements using command pip install -r requirementsWindows.txt
Once all requirements have installed successfully, start the server using the command fastapi dev main.py

When running the backend locally, you should see this when it is successfully running: Success Screen

Why It’s Useful​

Setting up Backend Transcription for Online mode​

For Mac/Linux​

For Windows​

Why It’s Useful

Setting up Backend Transcription for Online mode

For Mac/Linux

For Windows