Preparing the Environment

How to set up Discord, sound, whisper.cpp and everything else.

Installing Pi-Apps
Installing and Setting-Up Legcord
Creating a Dedicated Discord Account
Installing and Building the Right whisper.cpp
Download the Right Model(s)
Audio Input (the Almost Easy Way)
Preventing Interference
Setting-Up Files, Folders and Setting Permissions
(Optional) Activating VNC

Installing Pi-Apps

To install the Pi-Apps app store for Raspberry Pi, follow the instructions on their website:

wget -qO- https://raw.githubusercontent.com/Botspot/pi-apps/master/install | bash

Check the URL before running this command: piping a download directly to bash executes the code without any review.
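One way to follow that advice is to save the installer to a file first, read it, and only then run it. The following is a minimal sketch: the helper name fetch_installer and the target file name are made up for this example and are not part of Pi-Apps; only the URL comes from the instructions above.

```shell
# Hypothetical helper (not part of Pi-Apps): save an install script to a
# file so it can be read before it is executed.
fetch_installer() {
    url="$1"
    dest="$2"
    # -q: quiet, -O: write the download to "$dest" instead of stdout
    wget -qO "$dest" "$url" || return 1
    # Refuse to continue if nothing was downloaded
    [ -s "$dest" ] || return 1
    echo "Saved to $dest. Review it, then run: bash $dest"
}

# Usage (the file name is only an example):
# fetch_installer https://raw.githubusercontent.com/Botspot/pi-apps/master/install "$HOME/pi-apps-install.sh"
# less "$HOME/pi-apps-install.sh"
# bash "$HOME/pi-apps-install.sh"
```

The extra step costs a few seconds and leaves a copy of exactly what was executed.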

The installed app store will reside in your home directory:

$HOME/pi-apps

Apps installed via Pi-Apps can usually be found in /opt. Keep in mind that their folders under /opt might be missing from the shell's PATH, so you may have to call them by their full path.
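Whether a given folder is on the PATH can be checked before relying on it. A minimal sketch: the function name on_path is made up for this example, and /opt/Legcord only serves as a sample folder.

```shell
# Hypothetical helper: succeed if folder $1 is listed in $PATH.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if on_path /opt/Legcord; then
    echo "/opt/Legcord is on the PATH"
else
    echo "/opt/Legcord is not on the PATH; call its programs by full path"
fi
```

Wrapping both sides in colons makes sure that only whole PATH entries match, so /opt does not count as present just because /opt/Legcord is.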

Installing and Setting-Up Legcord

The official Discord client currently isn't available for devices with ARM processors running Windows or Linux². The app Legcord will act as a replacement.

Installation

Open Pi-Apps and go to Internet → Communication → Legcord to install.

Post-Installation Set-up

Start Legcord from the applications menu in the upper-left corner. This will start Legcord via the /opt/Legcord/legcord-wayland.sh script and thus with the right settings for the Raspberry Pi Wayland GUI. (Path may vary depending on the installed version.)

When starting Legcord for the first time, you'll be asked to set some basic options.

I chose the native window style to limit possible compatibility problems down the line and to work in an environment that I'm already used to.

I've chosen Vencord over Equicord for stability and to reduce the risk of getting my account banned for breaking Discord's Terms of Service.

I've chosen to enable the system tray icon as a personal preference.

²) There is an official Discord client for macOS on ARM processors, though.

Creating a Dedicated Discord Account

To limit the complexity and to avoid mapping and capturing several audio devices, I chose to create a new Discord account that is used solely for running TranscriptOMatic. This is necessary because one account can't join a voice session from two devices or clients at the same time.

Apart from using a good password, consider using 2FA to secure the account.

People will sometimes see this secondary account as a bot.
It is not.

A bot offers functions that (most of the time) can be triggered in Discord or automatically without direct interaction by the owner of the bot.
In contrast, your dedicated second account has to be operated by a human - yourself - via a Discord client.

Installing and Building the Right whisper.cpp

Cloning the GitHub repository locally

cd ~
git clone https://github.com/ggerganov/whisper.cpp

Installing additional libraries

sudo apt install -y libsdl2-dev

Building `whisper-stream`

The -j2 flag limits parallel build jobs to 2. Using -j4 or higher may cause the Pi to crash due to memory exhaustion.

cd ~/whisper.cpp
rm -rf build
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DWHISPER_SDL2=ON
cmake --build build --target whisper-stream -- -j2
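After the build finishes, it helps to confirm that the binary actually exists before moving on. The sketch below assumes the build places it in build/bin/, which is where current whisper.cpp versions put their executables; adjust the path if your checkout differs. The function name check_binary is made up for this example.

```shell
# Hypothetical helper: report whether an executable exists at path $1.
check_binary() {
    if [ -x "$1" ]; then
        echo "OK: $1"
    else
        echo "Missing or not executable: $1" >&2
        return 1
    fi
}

# Assumed output location of the build above
check_binary "$HOME/whisper.cpp/build/bin/whisper-stream" \
    || echo "Re-run the build step above."
```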

Download the Right Model(s)

Download the tiny.en language model

→ this is the only model that will provide useful results on a Raspberry Pi 500

bash ~/whisper.cpp/models/download-ggml-model.sh tiny.en

Better hardware will support bigger language models.
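The download script stores each model as ggml-<model>.bin, by default in the models folder next to the script. A small check like the following can confirm that the file arrived. The function names are made up for this sketch.

```shell
# Hypothetical helper: print the file path the download script uses for model $1.
# Optional $2: models folder (defaults to the cloned repository's models folder).
model_file() {
    echo "${2:-$HOME/whisper.cpp/models}/ggml-$1.bin"
}

# Hypothetical helper: succeed if the model file exists and is not empty.
have_model() {
    [ -s "$(model_file "$@")" ]
}

if have_model tiny.en; then
    echo "tiny.en is ready: $(model_file tiny.en)"
else
    echo "tiny.en not found, run the download script first"
fi
```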

Language Models: What is the difference?

whisper.cpp provides several different language models. Run the script ~/whisper.cpp/models/download-ggml-model.sh without parameters to see the currently available models^#:

mela@Cox:~/meetings/recordings/2026-02-21T013852 $ ~/whisper.cpp/models/download-ggml-model.sh
Usage: /home/mela/whisper.cpp/models/download-ggml-model.sh <model> [models_path]

Available models:
  tiny tiny.en tiny-q5_1 tiny.en-q5_1 tiny-q8_0
  base base.en base-q5_1 base.en-q5_1 base-q8_0
  small small.en small.en-tdrz small-q5_1 small.en-q5_1 small-q8_0
  medium medium.en medium-q5_0 medium.en-q5_0 medium-q8_0
  large-v1 large-v2 large-v2-q5_0 large-v2-q8_0 large-v3 large-v3-q5_0 large-v3-turbo large-v3-turbo-q5_0 large-v3-turbo-q8_0

___________________________________________________________
.en = english-only -q5_[01] = quantized -tdrz = tinydiarize
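The naming scheme from the legend can be decoded mechanically. The following sketch is only an illustration of how to read the names; the function describe_model is made up and not part of whisper.cpp. It also treats -q8_0 as quantized, which the legend does not list explicitly.

```shell
# Hypothetical helper: describe a model name using the legend's suffixes.
describe_model() {
    name="$1"
    desc=""
    # ".en" at the end or followed by another suffix: English-only
    case "$name" in
        *.en|*.en-*) desc="english-only" ;;
        *)           desc="multilingual" ;;
    esac
    # "-q<digit>_<digit>": quantized weights
    case "$name" in
        *-q[0-9]_[0-9]*) desc="$desc, quantized" ;;
    esac
    # "-tdrz": tinydiarize variant
    case "$name" in
        *-tdrz*) desc="$desc, tinydiarize" ;;
    esac
    echo "$name: $desc"
}

for m in tiny.en base-q5_1 small.en-tdrz large-v3-turbo-q8_0; do
    describe_model "$m"
done
```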

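Models that only support English (.en) are about the same size as their multilingual counterparts and need about the same memory, but at the small end (tiny.en, base.en) they tend to transcribe English more accurately. Multilingual models provide support for 99 different languages and necessarily carry some overhead for that.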

While multilingual models from tiny to large all support the same set of languages, the size of the model determines how well whisper.cpp handles accents, mumbling, people talking over each other or specialist language.

Since this describes the typical TTRPG environment pretty well, smaller language models will run on inexpensive hardware but provide only limited results.

^#) for an explanation of quantization and diarization look here

Audio Input (the Almost Easy Way)

TranscriptOMatic connects to Discord voice through a Discord account dedicated to this purpose. During sessions, the Discord client (Legcord) joins the Discord voice session.

To reduce complexity, no microphone or speaker is attached to the device.

With this setup, only a single audio source has to be captured. If you want to adapt this concept to a device you are actively using, you also need to capture your microphone input, because Discord does not play your own voice back to you.
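This section does not show how the single audio source is captured. One common approach on PulseAudio-compatible sound servers is a virtual output device (a "null sink") whose monitor source can be recorded like a microphone. This is an assumption here and not necessarily what TranscriptOMatic's own scripts do. The function names and the sink name below are made up for this sketch.

```shell
# Hypothetical helper: create a virtual output device named $1.
# Audio played into it can be recorded from the source "$1.monitor".
create_capture_sink() {
    pactl load-module module-null-sink \
        sink_name="$1" \
        sink_properties=device.description="$1"
}

# Hypothetical helper: print the name of the matching capture source.
capture_source_for() {
    echo "$1.monitor"
}

# Usage (the sink name is only an example):
# create_capture_sink transcript_sink
# capture_source_for transcript_sink
```

The Discord client would then use the virtual device as its output, and the recording side would read from the monitor source.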

Preventing Interference

In Legcord, open the Discord Voice & Video settings and make these changes:

Set all Sounds to off
Set the Soundboard Volume to off

Discord's notification sounds and soundboard audio would otherwise be picked up by the virtual microphone and fed into the transcription, giving you lower-quality results. This is especially relevant when working with a smaller model like tiny, which has less capacity to filter out irrelevant audio.

Setting-Up Files, Folders and Setting Permissions

Files & Folders

cd "$HOME"
# Folders for library files, commands and recordings
mkdir -p meetings/lib meetings/bin meetings/recordings
# Empty command scripts, made executable right away
touch meetings/bin/meeting-start meetings/bin/meeting-stop meetings/bin/meeting-follow
chmod 755 meetings/bin/meeting-start meetings/bin/meeting-stop meetings/bin/meeting-follow
# Empty library files
touch meetings/lib/paths.sh meetings/lib/whisper.sh

Editing the Scripts

Enter the scripts' contents into the prepared files using your preferred editor.

If you are not comfortable using a command line editor like vi or nano, use any text or code editor on your main computer and copy the files onto the Raspberry Pi 500 with scp or an (S)FTP client.

Checking the Scripts Are Executable

Now check the permissions. meeting-start, meeting-stop and meeting-follow have to be executable. (This is done by chmod 755 above.)

ls -l meetings/bin/

You should see a result like this:

mela@Cox:~ $ ls -l meetings/bin/
total 20
-rwxr-xr-x 1 mela mela  526 Jan 11 08:51 meeting-follow
-rwxr-xr-x 1 mela mela 4558 Feb 21 00:17 meeting-start
-rwxr-xr-x 1 mela mela 2596 Feb 21 00:24 meeting-stop

-rwxr-xr-x means the owner can read, write and execute the file, while the group and everyone else can read and execute it.
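The same check can be done per file without reading the ls output by eye. A minimal sketch, assuming GNU stat as shipped with Raspberry Pi OS; the function name show_mode is made up for this example. A file set up with chmod 755 reports mode 755.

```shell
# Hypothetical helper: print the octal mode of $1 and whether the
# current user may execute it. Relies on GNU stat (-c '%a').
show_mode() {
    mode=$(stat -c '%a' "$1") || return 1
    if [ -x "$1" ]; then
        echo "$1: $mode, executable"
    else
        echo "$1: $mode, not executable"
    fi
}

for f in meeting-start meeting-stop meeting-follow; do
    show_mode "$HOME/meetings/bin/$f" || echo "$f is missing"
done
```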

Adding the Scripts to the PATH environment variable

Add $HOME/meetings/bin/ to the shell's PATH by editing the .bashrc file in your home directory:

vi ~/.bashrc

Add this as the last line:

export PATH="$HOME/meetings/bin:$PATH"

Save and close the editor. Then load the updated .bashrc into your active session by entering:

source ~/.bashrc
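If you prefer not to edit the file by hand, the line can also be appended from the command line. The sketch below adds it only when it is not there yet, so running it twice does no harm. The function name append_once is made up for this example.

```shell
# Hypothetical helper: append line $2 to file $1 unless it is already there.
append_once() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the text literally
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep $HOME and $PATH unexpanded in the file):
# append_once "$HOME/.bashrc" 'export PATH="$HOME/meetings/bin:$PATH"'
```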

(Optional) Activating VNC

Consider this if you want to run TranscriptOMatic without a display hooked up to the device.
VNC allows you to access the Raspberry Pi desktop remotely from another computer.

Log in to your Raspberry Pi, either via ssh or via the GUI if you have a monitor (and keyboard) connected. In the GUI, open a terminal. Start raspi-config:

sudo raspi-config

Go to 3 Interface Options
Go to I3 VNC
Choose Yes
Leave raspi-config
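The menu steps above can also be done in one command through raspi-config's non-interactive mode, which is handy over ssh. A minimal sketch; the wrapper function enable_vnc is made up for this example. Note that in this mode 0 means "enable" and 1 means "disable".

```shell
# Enable VNC without going through the menu.
# In raspi-config's nonint mode, 0 = enable and 1 = disable.
enable_vnc() {
    sudo raspi-config nonint do_vnc 0
}

# Usage:
# enable_vnc
```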
