Sequence Diagrams
Sequence Diagram 1 - Voice Recognition
Normal flow:
- Developer selects online or offline mode.
- Developer initializes model.
- System downloads the model package.
- Developer clicks Start Listening.
- System begins listening and records the utterance.
- API transcribes the audio.
- The transcription is sent back to the game and displayed to the developer.
- Developer clicks Stop Listening.
- System stops listening.
Alternate flows / exceptions:
- Model package not downloaded: show prompt “Init failed.”
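The listening lifecycle above can be sketched as a small state machine. This is only an illustration: VoiceSession, State, InitError, and the download callable are hypothetical names, not the project's actual API, and the download step is stubbed out as a function that reports success or failure.

```python
from enum import Enum, auto
from typing import Callable


class InitError(Exception):
    """Raised when the model package could not be downloaded."""


class State(Enum):
    UNINITIALIZED = auto()
    READY = auto()
    LISTENING = auto()


class VoiceSession:
    """Hypothetical session object used only to illustrate the flow."""

    def __init__(self, mode: str):
        if mode not in ("online", "offline"):
            raise ValueError("mode must be 'online' or 'offline'")
        self.mode = mode
        self.state = State.UNINITIALIZED

    def initialize(self, download: Callable[[], bool]) -> None:
        # 'download' stands in for the real model download step.
        if not download():
            # Alternate flow: model package not downloaded.
            raise InitError("Init failed.")
        self.state = State.READY

    def start_listening(self) -> None:
        if self.state is not State.READY:
            raise RuntimeError("initialize the model before listening")
        self.state = State.LISTENING

    def stop_listening(self) -> None:
        # Stopping when not listening is treated as a no-op (an assumption).
        if self.state is State.LISTENING:
            self.state = State.READY
```

A caller would catch InitError and show the "Init failed." prompt instead of enabling the Start Listening button.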
Sequence Diagram 2 - Extract Commands
Normal flow:
- The system captures AAC board voice input.
- SpeechConverter transcribes the audio into text (e.g., “please jump now”).
- The transcription is tokenized by Command Converter.
- Tokenized transcription is filtered to remove filler words, sounds, and non-command words (e.g., “please” and “now”).
- Remaining tokens are mapped to commands.
- Commands are sent to the game.
- Game displays and logs the commands.
Alternate flows / exceptions:
- No command: after filtering, no command words remain (e.g., the utterance was “uh now”), so no game action is taken.
- Multiple possible commands: request quick confirmation (“Did you mean JUMP?”) or choose highest-confidence and log uncertainty.
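A minimal sketch of the tokenize, filter, and map steps follows. The filler-word list, the command table, and the function names are assumptions chosen for illustration. The sketch does not cover the confirmation prompt for ambiguous input.

```python
import re

# Assumed example data; a real game would supply its own lists.
FILLER_WORDS = {"please", "now", "uh", "um"}
COMMANDS = {"jump": "JUMP", "pause": "PAUSE"}


def tokenize(transcription: str) -> list[str]:
    """Lowercase the transcription and split it into word tokens."""
    return re.findall(r"[a-z']+", transcription.lower())


def extract_commands(transcription: str) -> list[str]:
    """Drop filler tokens, then map the remaining tokens to commands."""
    tokens = [t for t in tokenize(transcription) if t not in FILLER_WORDS]
    # Tokens that are neither filler nor a known command are ignored.
    return [COMMANDS[t] for t in tokens if t in COMMANDS]
```

With these tables, extract_commands("please jump now") returns ["JUMP"], and extract_commands("uh now") returns an empty list, which corresponds to the "no game action" case.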
Sequence Diagram 3 - Speaker Separation
Normal flow:
- System captures mixed audio with multiple speakers.
- The speaker-separation model splits the audio stream into streams for each speaker.
- The model runs on each isolated player stream and transcribes its utterance.
- Transcription is normalized and mapped to a game command (e.g., PauseGame).
- API sends PauseGame to the game; UI confirms action.
- Command and speaker attribution are logged.
Alternate flows / exceptions:
- Separation uncertain / low confidence: show a quick confirmation prompt (“Did you say ‘pause’?”). If the player confirms, proceed; otherwise ignore.
Sequence Diagram 4 - Background Noise Filtering
Normal flow:
- System captures the noisy audio.
- Noise suppression/denoising module processes the audio to reduce background interference.
- ASR transcribes the cleaned audio.
- Transcription is matched to a command (e.g., “left” → MoveLeft).
- If confidence is high, API sends MoveLeft to the game and UI shows visual confirmation.
- Command and environment metadata (noise level) are logged.
Alternate flows / exceptions:
- Noise overwhelms voice: show a “no speech detected” note.
- Misrecognized phrase due to residual noise: if confidence is low, ask for a repeat or confirmation.
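The confidence-based branching in this flow can be sketched as a single decision function. The 0.8 threshold, the Recognition type, the command table, and the outcome labels are all assumptions made for this example. The source does not say what happens to a word that is not a known command, so the sketch simply ignores it.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.8  # assumed value; would be tuned per game

COMMAND_MAP = {"left": "MoveLeft", "right": "MoveRight"}  # assumed table


@dataclass
class Recognition:
    text: str          # transcription of the cleaned audio
    confidence: float  # recognizer confidence in the range 0.0 to 1.0


def decide(rec: Optional[Recognition]) -> tuple[str, Optional[str]]:
    """Return (outcome, command) for one recognition result."""
    if rec is None or not rec.text.strip():
        return ("no_speech", None)   # noise overwhelmed the voice
    command = COMMAND_MAP.get(rec.text.strip().lower())
    if command is None:
        return ("ignore", None)      # not a known command (assumption)
    if rec.confidence >= CONFIDENCE_THRESHOLD:
        return ("send", command)     # high confidence: send to the game
    return ("confirm", command)      # low confidence: ask to repeat or confirm
```

For example, decide(Recognition("left", 0.93)) yields ("send", "MoveLeft"), while the same word at confidence 0.4 yields ("confirm", "MoveLeft").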
Sequence Diagram 5 - Interpret Synonyms of Commands
Normal flow:
- System captures the utterance and model transcribes it (e.g., “hop”).
- The command-mapping module looks up the token in the synonym table.
- “hop” is mapped to canonical command Jump.
- API issues Jump to the game.
- UI shows visual confirmation; the synonym used and the mapping confidence are logged.
Alternate flows / exceptions:
- Developer has disabled synonym mapping: the synonym (e.g., “hop”) is treated as a non-command word and filtered out.
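The synonym lookup, including the case where the developer has turned synonym mapping off, might look like the following sketch. The table contents and the map_token name are illustrative assumptions.

```python
from typing import Optional

# Assumed example tables.
CANONICAL = {"jump": "Jump"}
SYNONYMS = {"hop": "Jump", "leap": "Jump"}


def map_token(token: str, synonyms_enabled: bool = True) -> Optional[str]:
    """Map a token to a canonical command, or None if it is not a command."""
    key = token.lower()
    if key in CANONICAL:
        return CANONICAL[key]
    if synonyms_enabled:
        return SYNONYMS.get(key)
    # Synonym mapping disabled: the token is treated as a non-command word.
    return None
```

Here map_token("hop") returns "Jump", and map_token("hop", synonyms_enabled=False) returns None, so the token would be filtered out.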
Sequence Diagram 6 - Register Game Commands
Normal flow:
- AAC game developer uses the API toolkit to add commands like Start Game, red, blue, green.
- They tell the API what each command means and map those commands to game actions.
- Developer speaks. The API transcribes and tokenizes the audio.
- The game executes and logs the command.
Postconditions: The system's command library contains the common commands. All commands for the AAC game are entered in the command library and can be used by players through the API.
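Command registration could be sketched as a small registry that maps spoken phrases to game actions. CommandRegistry and its methods are hypothetical names, and matching the whole transcription against a registered phrase is a simplification of the transcribe-and-tokenize step described above.

```python
from typing import Callable


class CommandRegistry:
    """Hypothetical command library mapping phrases to game actions."""

    def __init__(self) -> None:
        self._actions: dict[str, Callable[[], None]] = {}
        self.log: list[str] = []

    def register(self, phrase: str, action: Callable[[], None]) -> None:
        # Phrases are stored lowercase so matching is case-insensitive.
        self._actions[phrase.lower()] = action

    def dispatch(self, transcription: str) -> bool:
        """Run the action for a transcription; return False if unknown."""
        key = transcription.strip().lower()
        action = self._actions.get(key)
        if action is None:
            return False
        action()
        self.log.append(key)  # the executed command is logged
        return True
```

A developer would call register("Start Game", start_game) once per command, where start_game is the game's own handler, and then pass each transcription to dispatch.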
Sequence Diagram 7 - Toggle Input History
Normal flow:
- AAC player's caretaker opens the API window and goes to settings.
- The settings include a toggle for input history.
- The caretaker toggles off the input history.
- AAC player receives reduced visual stimuli and can comfortably enjoy playing the AAC game.
Alternate flows / exceptions:
- Developer has registered a new command and uses the command history to troubleshoot it.
- Once the command appears in the command history, the developer can be confident that it was registered correctly and is working.
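One way to model the input-history toggle is to keep recording commands while hiding them from the display. That behavior is an assumption, as are the InputHistory name and its methods. It lets a developer turn the history back on and still see earlier entries when troubleshooting.

```python
class InputHistory:
    """Hypothetical input-history panel with a visibility toggle."""

    def __init__(self, visible: bool = True) -> None:
        self.visible = visible
        self._entries: list[str] = []

    def record(self, command: str) -> None:
        # Commands are recorded even while the panel is hidden (assumption).
        self._entries.append(command)

    def toggle(self) -> bool:
        """Flip visibility and return the new state."""
        self.visible = not self.visible
        return self.visible

    def render(self) -> list[str]:
        """Return the entries to display; nothing when hidden."""
        return list(self._entries) if self.visible else []
```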
Sequence Diagram 8 - Confidence Level of Interpreted Game Input
Normal flow:
- Developer speaks game commands into the microphone.
- The game command is interpreted and passed to the game as input.
- The API returns a confidence level to the developer, indicating how confident it was in choosing that command based on synonyms of a known command.
- This gives the developer control over which commands are recognized as valid game inputs, ensuring that only reliable commands can affect gameplay.
Alternate flows / exceptions:
- The game incorrectly interprets the voice input.
- Developer adjusts the code accordingly.