Use-case descriptions
Use Case 1 - Voice Recognition
Actor: AAC Game Developer
Triggering event: Developer runs their game to test it, then clicks “Whisper Init” and activates the microphone.
Preconditions: Game is running and using the API; microphone access is granted; network is available.
Normal flow:
- Developer selects online or offline mode.
- Developer initializes model.
- System downloads the model package.
- Developer clicks Start Listening.
- System begins listening and records the utterance.
- API transcribes the audio.
- The transcription is sent back to the game and displayed to the developer.
- Developer clicks Stop Listening.
- System stops listening.
Alternate flows / exceptions:
- Model package not downloaded: show prompt “Init failed.”
Postconditions: Transcription has been displayed to the developer (or appropriate error/feedback displayed).
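The initialize / start / stop sequence above can be read as a small state machine. The following is a minimal sketch under that assumption; the class name `ListeningSession`, the `State` values, the `model_available` flag, and the "Init OK" message are hypothetical and are not part of the actual API. Only the “Init failed.” prompt comes from the use case.

```python
from enum import Enum, auto


class State(Enum):
    UNINITIALIZED = auto()
    READY = auto()
    LISTENING = auto()


class ListeningSession:
    """Hypothetical stand-in for the API's session object (illustration only)."""

    def __init__(self, model_available: bool = True) -> None:
        # model_available simulates whether the model package could be downloaded
        self.model_available = model_available
        self.state = State.UNINITIALIZED

    def init_model(self, mode: str) -> str:
        if mode not in ("online", "offline"):
            raise ValueError(f"unknown mode: {mode}")
        if not self.model_available:
            return "Init failed."  # alternate flow: model package not downloaded
        self.state = State.READY
        return "Init OK"  # hypothetical success message

    def start_listening(self) -> None:
        if self.state is not State.READY:
            raise RuntimeError("initialize the model before listening")
        self.state = State.LISTENING

    def stop_listening(self) -> None:
        if self.state is not State.LISTENING:
            raise RuntimeError("not currently listening")
        self.state = State.READY
```

Modeling the flow this way makes the ordering constraint explicit: listening cannot start until initialization has succeeded.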
Use Case 2 - Extract Commands
Actor: AAC Player, AAC Game Developer
Triggering event: AAC user speaks while playing an AAC game, e.g., “please jump now.”
Preconditions: Game is in a state that accepts gameplay commands; microphone is active.
Normal flow:
- The system captures AAC board voice input.
- SpeechConverter transcribes the audio into text (e.g., “please jump now”).
- The transcription is tokenized by Command Converter.
- Tokenized transcription is filtered to remove filler words, sounds, and non-command words (e.g., "please" and "now").
- Remaining tokens are mapped to commands.
- Commands are sent to the game.
- Game displays and logs the commands.
Alternate flows / exceptions:
- No command: the filter leaves no command words (e.g., utterance was “uh now”), so no game action is taken.
- Multiple possible commands: request quick confirmation (“Did you mean JUMP?”) or choose highest-confidence and log uncertainty.
Postconditions: Jump action executed; command history updated.
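The tokenize, filter, and map steps of this use case can be sketched as follows. This is an illustrative sketch only: the filler-word set, the command table, and the function name `extract_commands` are assumptions made for the example and do not describe the actual Command Converter.

```python
import re

# Hypothetical filler list and command table, for illustration only.
FILLER_WORDS = {"please", "now", "uh", "um"}
COMMANDS = {"jump": "JUMP", "pause": "PAUSE"}


def extract_commands(transcription: str) -> list[str]:
    """Tokenize a transcription, drop filler words, map the rest to commands."""
    # Tokenize: lowercase the text and keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", transcription.lower())
    # Filter: remove filler words.
    kept = [t for t in tokens if t not in FILLER_WORDS]
    # Map: keep only tokens that correspond to a known command.
    return [COMMANDS[t] for t in kept if t in COMMANDS]
```

With these tables, `extract_commands("please jump now")` returns `["JUMP"]`, and `extract_commands("uh now")` returns an empty list, which corresponds to the "no game action" alternate flow.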
Use Case 3 - Speaker Separation
Actor: AAC Player and nearby non-player speakers (e.g., parent)
Triggering event: AAC player speaks a command while other people speak at the same time.
Preconditions: Online mode and speaker-separation model is enabled.
Normal flow:
- System captures mixed audio with multiple speakers.
- The speaker-separation model splits the audio stream into streams for each speaker.
- The ASR model runs on the isolated player stream and transcribes the utterance.
- Transcription is normalized and mapped to a game command (e.g., PauseGame).
- API sends PauseGame to the game; UI confirms action.
- Log command and speaker attribution.
Alternate flows / exceptions:
- Separation uncertain / low confidence: show a quick confirmation prompt (“Did you say ‘pause’?”). If the player confirms, proceed; otherwise ignore.
Postconditions: The player's command is recognized and executed despite overlapping speech; the system records speaker attribution and confidence.
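The step that follows separation, picking the player's stream and confirming when confidence is low, might look like the sketch below. The separation model itself is not shown. The `SeparatedStream` type, the "player" label, the `confirm` callback, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SeparatedStream:
    speaker: str       # label assigned by the (assumed) separation model
    text: str          # transcription of this speaker's stream
    confidence: float  # 0.0 to 1.0


CONFIRM_BELOW = 0.6  # illustrative threshold, not a value from the source


def player_utterance(
    streams: list[SeparatedStream],
    confirm: Callable[[str], bool],
) -> Optional[SeparatedStream]:
    """Return the player's stream, or None if absent or rejected at the prompt."""
    for stream in streams:
        if stream.speaker != "player":
            continue  # non-player speech is never turned into a command
        if stream.confidence < CONFIRM_BELOW and not confirm(stream.text):
            return None  # player did not confirm: ignore the utterance
        return stream
    return None
```

The returned object carries both the speaker label and the confidence, so the caller can log attribution alongside the command.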
Use Case 4 - Background Noise Filtering
Actor: AAC Player
Triggering event: AAC player issues a command in a noisy environment (e.g., TV).
Preconditions: Noise-robust ASR / denoising pipeline active; microphone picks up signal.
Normal flow:
- System captures the noisy audio.
- Noise suppression/denoising module processes the audio to reduce background interference.
- ASR transcribes the cleaned audio.
- Transcription is matched to a command (e.g., “left” → MoveLeft).
- If confidence is high, API sends MoveLeft to the game and UI shows visual confirmation.
- Command and environment metadata (noise level) are logged.
Alternate flows / exceptions:
- Noise overwhelms voice: show a “no speech detected” notice.
- Misrecognized phrase due to residual noise: if confidence is low, ask for a repeat or confirmation.
Postconditions: Movement executed (or prompt shown); noise metrics recorded for debugging.
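The three outcomes in this use case (send the command, ask for a repeat, or report no speech) can be expressed as one decision function applied to the ASR result. This is a sketch under assumptions: the `Outcome` enum, the `decide` function, the phrase table, and the 0.75 threshold are hypothetical, and denoising and transcription are assumed to have already happened.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    NO_SPEECH = "no speech detected"
    ASK_REPEAT = "ask for repeat or confirmation"
    SEND_COMMAND = "send command"


# Hypothetical phrase table and threshold, for illustration only.
PHRASE_TO_COMMAND = {"left": "MoveLeft", "pause": "PauseGame"}
MIN_CONFIDENCE = 0.75


def decide(transcription: str, confidence: float) -> tuple[Outcome, Optional[str]]:
    """Choose what to do with a transcription of denoised audio."""
    text = transcription.strip().lower()
    if not text:
        # Nothing was transcribed: noise overwhelmed the voice.
        return Outcome.NO_SPEECH, None
    command = PHRASE_TO_COMMAND.get(text)
    if command is None or confidence < MIN_CONFIDENCE:
        # Unknown phrase or low confidence: do not act, ask the player.
        return Outcome.ASK_REPEAT, None
    return Outcome.SEND_COMMAND, command
```

Returning the outcome together with the command keeps the UI decision (confirmation, prompt, or notice) separate from the recognition logic.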
Use Case 5 - Interpret Synonyms of Commands
Actor: AAC Player; Developer (configures mapping)
Triggering event: AAC player uses a synonym (e.g., “go” for Move, “hop” for Jump).
Preconditions: Synonym mapping table exists (configured by developer or default set); speech-to-text model and command mapper are active.
Normal flow:
- System captures the utterance and model transcribes it (e.g., “hop”).
- The command-mapping module looks up the token in the synonym table.
- “hop” is mapped to canonical command Jump.
- API issues Jump to the game.
- Provide visual confirmation, log synonym used, and log mapping confidence.
Alternate flows / exceptions:
- Synonym mapping disabled by developer: synonyms are treated as non-command words and filtered out.
Postconditions: Correct canonical command executed.
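A synonym lookup of the kind described here can be sketched with two small tables. The table contents reuse the examples from this use case (“go” for Move, “hop” for Jump); the function name `map_token` and the `synonyms_enabled` flag are hypothetical names introduced for illustration.

```python
from typing import Optional

# Canonical command words and a synonym table (examples taken from the use case).
CANONICAL = {"move": "Move", "jump": "Jump"}
SYNONYMS = {"go": "Move", "hop": "Jump"}


def map_token(token: str, synonyms_enabled: bool = True) -> Optional[str]:
    """Map a transcribed token to a canonical command, or None if it is not one."""
    key = token.lower()
    if key in CANONICAL:
        return CANONICAL[key]
    if synonyms_enabled:
        return SYNONYMS.get(key)
    # Synonym mapping disabled: the token is treated as a non-command word.
    return None
```

Here `map_token("hop")` returns `"Jump"`, while `map_token("hop", synonyms_enabled=False)` returns `None`, matching the alternate flow.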
Use Case 6 - Register Game Commands
Actor: AAC Game Developer
Triggering event: Developer uses the API toolkit to set up the basic commands the game will understand.
Preconditions: Game uses the API.
Normal flow:
- AAC game developer uses the API toolkit to add commands like Start Game, red, blue, green.
- Developer defines what each command means and maps each command to a game action.
- Developer speaks a registered command. The API transcribes and tokenizes the audio.
- The game executes and logs the command.
Postconditions: System contains common commands in a command library. All commands for the AAC game are entered in the command library, and can be used by players through the API.
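Registration as described above amounts to storing a mapping from a spoken phrase to a game action. The sketch below assumes a hypothetical `CommandLibrary` class with `register` and `execute` methods; these names are illustrative and are not the toolkit's real interface.

```python
from typing import Callable


class CommandLibrary:
    """Hypothetical command library mapping spoken phrases to game actions."""

    def __init__(self) -> None:
        self._actions: dict[str, Callable[[], None]] = {}
        self.log: list[str] = []  # commands that were executed, in order

    def register(self, phrase: str, action: Callable[[], None]) -> None:
        # Phrases are stored lowercased so matching is case-insensitive.
        self._actions[phrase.lower()] = action

    def execute(self, phrase: str) -> bool:
        """Run the action for a phrase; return False if it is not registered."""
        key = phrase.lower()
        action = self._actions.get(key)
        if action is None:
            return False
        action()
        self.log.append(key)
        return True
```

A developer would call `register("Start Game", start_game)` once per command, where `start_game` is whatever function the game exposes, and then pass each transcribed phrase to `execute`.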
Use Case 7 - Toggle Input History
Actor: AAC Game Developer; AAC Player
Triggering event: Player is overstimulated by the AAC game.
Preconditions: AAC game is running and game command history is visible to players.
Normal flow:
- AAC player's caretaker uses the API window and goes to settings.
- The system has toggleable settings for input history.
- The caretaker toggles off the input history.
- AAC player receives reduced visual stimuli and can comfortably enjoy playing the AAC game.
Alternate flows / exceptions:
- Developer has registered a new command and uses the command history to troubleshoot it. Once the command appears in the command history, the developer can be confident that it was registered correctly and is working.
Postconditions: AAC game is playable without a visible command history.
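One way to support both the caretaker's and the developer's needs is to keep recording commands while only toggling whether they are displayed. That design is an assumption made for this sketch, not something the use case states; the `InputHistory` class and its method names are hypothetical.

```python
class InputHistory:
    """Hypothetical input history whose on-screen visibility can be toggled."""

    def __init__(self) -> None:
        self.visible = True
        self._entries: list[str] = []

    def record(self, command: str) -> None:
        # Commands are always recorded, even while the history is hidden.
        self._entries.append(command)

    def toggle(self) -> bool:
        """Flip visibility and return the new setting."""
        self.visible = not self.visible
        return self.visible

    def displayed(self) -> list[str]:
        # What the UI should draw: the entries if visible, otherwise nothing.
        return list(self._entries) if self.visible else []
```

Under this design, toggling the history back on shows commands issued while it was hidden, which keeps it usable for troubleshooting.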
Use Case 8 - Confidence Level of Interpreted Game Input
Actor: AAC Game Developer
Triggering event: Developer is testing new commands through API speech input.
Preconditions: Game is in a state that accepts gameplay commands; microphone is active.
Normal flow:
- Developer speaks game commands into the microphone.
- The API interprets the game command and passes it to the game.
- The API returns a confidence level indicating how confident it was in choosing that command, based on how closely the input matched a known command or its synonyms.
- The developer can use this confidence level to control which commands are accepted as valid game inputs, ensuring that only reliable commands affect gameplay.
Alternate flows / exceptions:
- The game incorrectly interprets the voice input.
- Developer adjusts the code accordingly.
Postconditions: Game provides confidence level when it interprets gameplay commands.
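A confidence-gated match could be sketched as below. Note the substitution: this example scores confidence with plain string similarity from Python's standard `difflib` module, which is a stand-in and not the synonym-based scoring the API is described as using. The command list, the `best_match` name, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical list of registered commands, for illustration only.
KNOWN_COMMANDS = ["jump", "pause", "left"]


def best_match(word: str, threshold: float = 0.8) -> tuple[Optional[str], float]:
    """Return (command or None, confidence) for the closest known command."""
    word = word.lower()
    # Score every known command by string similarity in the range 0.0 to 1.0.
    scored = [(SequenceMatcher(None, word, c).ratio(), c) for c in KNOWN_COMMANDS]
    confidence, command = max(scored)
    # Only accept the command if the score reaches the developer's threshold.
    return (command if confidence >= threshold else None), confidence
```

Raising the threshold makes the game stricter about which inputs count as commands; lowering it accepts more near misses at the cost of more false positives.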