Dilek Karasoy for Picovoice

"Pico Chess, start a new game": .NET Speech Recognition Tutorial

ChessCore is an open-source, cross-platform chess engine written in .NET Core with a text-based interface. Our team decided to voice-enable it to showcase how easy it is to work with the Picovoice .NET SDK.


Let's get started:
1. Replace the text-based interface with a voice user interface:
ChessCore keeps the chess-playing engine separate from the interface, so developers can easily swap the text-based interface for a voice user interface.

Once you extract the useful items from the original Program.cs, you will have something like this:

class Program
{
    static readonly Engine gameEngine = new Engine();

    static void Main(string[] _)
    {
        // so we can see actual chess pieces on the board!
        Console.OutputEncoding = Encoding.UTF8;

        // start game loop
        RunGame();  
    }

    // game control
    static void RunGame() { ... }
    static void NewGame() { ... }
    static void QuitGame() { ... }  

    // piece movement
    static string MakePlayerMove(string srcSide, string srcFile, string srcRank, 
        string dstSide, string dstFile, string dstRank) { ... }
    static string MakeOpponentMove() { ... }
    static void UndoMove() { ... }

    // end game logic
    static bool CheckEndGame() { ... }
    static string GetEndGameReason() { ... }

    // translation functions
    static byte GetRow(string move) { ... }
    static string GetRow(byte row) { ... }
    static string GetColumn(byte col) { ... }
    static byte GetColumn(string side, string file) { ... }
    static string GetPieceSymbol(ChessPieceColor color, ChessPieceType type) { ... }

    // UTF-8 board to console
    static void DrawBoard(string aboveBoardText) { ... }
}

2. Design the Voice User Interface
To build a hands-free app that understands commands like "Pico Chess, start a new game", we need one engine to detect the hotword "Pico Chess" and another to infer the user's intent, in this case starting a game. The first is powered by Porcupine Wake Word, the second by Rhino Speech-to-Intent.
2.1. Train a custom hotword

Sign up for the Picovoice Console for free if you haven't already, and go to the Porcupine section. Type “Pico Chess” or a hotword of your choice (Porcupine supports multiple languages), then select the target platform. For a cross-platform experience, train one model each for Windows, Linux and macOS.

2.2. Train a context to understand follow-up commands
Go to the Rhino section on the Picovoice Console and create a new context. You can design your own model from scratch, but for the sake of simplicity, just download this YAML file and import it. Adjusting an existing context is easier than starting from a blank one, especially if this is your first project. Then train and download the model.

PS: Grab your AccessKey from the Picovoice Console, while you're there. You'll need it shortly.
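
The snippets that follow hardcode the AccessKey as a string literal. If you would rather keep it out of source control, one option is to read it from an environment variable. This is a minimal sketch, not part of the Picovoice SDK: the `AccessKeyLoader` class and the `PICOVOICE_ACCESS_KEY` variable name are assumptions made for illustration.

```csharp
using System;

static class AccessKeyLoader
{
    // Hypothetical variable name; use whatever name you export in your shell.
    public const string VariableName = "PICOVOICE_ACCESS_KEY";

    // Returns the trimmed AccessKey from the environment,
    // or throws with a hint when the variable is missing or blank.
    public static string Load()
    {
        var key = Environment.GetEnvironmentVariable(VariableName);
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                $"Set the {VariableName} environment variable to your Picovoice AccessKey.");
        }
        return key.Trim();
    }
}
```

With this in place, `string accessKey = AccessKeyLoader.Load();` could stand in for the hardcoded literal in `RunGame`.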

3. Wire it up!
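Now it's time to add the Picovoice NuGet package and the voice AI model files to the ChessCore project. Because the model files are platform-specific, the code below picks the right ones at runtime and passes them to Picovoice.Create together with the wake word and inference callbacks: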

static string _platform => RuntimeInformation.IsOSPlatform(OSPlatform.OSX) ? "mac" :
               RuntimeInformation.IsOSPlatform(OSPlatform.Linux) ? "linux" :
               RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "windows" : 
               "";

static void RunGame()
{
    // init picovoice platform
    string accessKey = "..."; // replace with your Picovoice AccessKey
    string keywordPath = $"pico_chess_{_platform}.ppn";
    string contextPath = $"chess_{_platform}.rhn";

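    using Picovoice picovoice = Picovoice.Create(
        accessKey,
        keywordPath,
        WakeWordCallback,   // method names are case-sensitive in C#
        contextPath,
        InferenceCallback);

    // DrawBoard takes the text to show above the board; nothing to report yet
    DrawBoard("");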

    // start play
    // ...
}

static void WakeWordCallback()
{
    Console.WriteLine("\n Listening for command...");
}

static void InferenceCallback(Inference inference)
{
    // logic for when Rhino infers an intent
}
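
The snippet above uses bare file names, so the model files must be in the working directory when the game starts. A possible way to make this more robust is to resolve them relative to the executable and fail early with a readable message. The sketch below assumes the `.ppn` and `.rhn` files are copied next to the built binary; the `ModelFiles` helper is hypothetical and not part of the Picovoice SDK.

```csharp
using System;
using System.IO;

static class ModelFiles
{
    // Resolves a model file relative to the directory of the running executable.
    public static string ResolveExisting(string fileName)
    {
        return ResolveExisting(fileName, AppContext.BaseDirectory);
    }

    // Same as above, but with an explicit base directory (handy for testing).
    public static string ResolveExisting(string fileName, string baseDirectory)
    {
        string path = Path.Combine(baseDirectory, fileName);
        if (!File.Exists(path))
        {
            // Fail before any engine is created, with the full path in the message.
            throw new FileNotFoundException($"Model file not found: {path}", path);
        }
        return path;
    }
}
```

In `RunGame`, this could be used as `string keywordPath = ModelFiles.ResolveExisting($"pico_chess_{_platform}.ppn");`. It also surfaces the case where `_platform` is empty on an unrecognized OS, since the resulting file name will not exist.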

The .NET SDK Inference class has three immutable properties:

IsUnderstood: whether Rhino Speech-to-Intent matched one of the commands or not
Intent: if understood, which intent was inferred
Slots: if understood, a dictionary with data relating to the intent

If you used the existing YAML file, the intents are move, newGame, undo and quit.
Note that Slots, the dictionary holding the source and destination coordinates, is only populated for move and is empty for the other intents. So the inference callback will look something like this:

static void InferenceCallback(Inference inference)
{
    if (inference.IsUnderstood)
    {           
        if (inference.Intent.Equals("move"))
        {
            if (CheckEndGame()) 
                return;

            // get source coordinates
            string srcSide = inference.Slots["srcSide"];
            string srcRank = inference.Slots["srcRank"];
            string srcFile = inference.Slots.ContainsKey("srcFile") ?
                inference.Slots["srcFile"] : "";

            // get destination coordinates
            string dstSide = inference.Slots["dstSide"];
            string dstRank = inference.Slots["dstRank"];
            string dstFile = inference.Slots.ContainsKey("dstFile") ?
                inference.Slots["dstFile"] : "";

            // try to make player move
            string playerMove = MakePlayerMove(srcSide, srcFile, srcRank, 
                                               dstSide, dstFile, dstRank);
            if (playerMove.Equals("Invalid Move"))
            {
                DrawBoard($" {playerMove}\n");
                return;
            }

            // make opponent move if player move was valid
            string theirMove = MakeOpponentMove();
            DrawBoard($" \u2654  {playerMove}\n \u265A  {theirMove}");

            // end game if necessary
            if (CheckEndGame())
            {
                Console.WriteLine($"\n {GetEndGameReason()}");
                Console.WriteLine($" Say 'new game' to play again.");
            }
        }
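        else if (inference.Intent.Equals("undo"))
        {
            // matches the UndoMove() method declared in Program
            UndoMove();
        }
        // compared case-insensitively because the context spells this intent "newGame"
        else if (inference.Intent.Equals("newGame", StringComparison.OrdinalIgnoreCase))
        {
            NewGame();
        }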
        else if (inference.Intent.Equals("quit"))
        {
            QuitGame();
        }
    }
    else
    {
        DrawBoard(" Didn't understand move.\n");
    }
}
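
The two optional-slot lookups in the callback above repeat the same ContainsKey pattern. One way to tidy this up is a small helper built on `TryGetValue`. This is a sketch that assumes `Slots` behaves like a string-to-string dictionary, as the indexer and `ContainsKey` calls above suggest; the `SlotReader` helper is hypothetical and not part of the SDK.

```csharp
using System.Collections.Generic;

static class SlotReader
{
    // Returns the slot value when present, otherwise the fallback (empty by default).
    public static string GetOrDefault(
        IDictionary<string, string> slots, string key, string fallback = "")
    {
        return slots.TryGetValue(key, out var value) ? value : fallback;
    }
}
```

With it, the source file lookup could be written as `string srcFile = SlotReader.GetOrDefault(inference.Slots, "srcFile");`, which performs a single dictionary lookup instead of two.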

4. Get PicoChess to listen to commands:
If you think cross-platform microphone control is challenging in .NET, you're not alone. That's why Picovoice built PvRecorder.

static bool _quitGame = false;

static void RunGame()
{       
    // init picovoice platform
    string accessKey = "..."; // replace with your Picovoice AccessKey
    string keywordPath = $"pico_chess_{_platform}.ppn";
    string contextPath = $"chess_{_platform}.rhn";

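    using Picovoice picovoice = Picovoice.Create(
        accessKey,
        keywordPath,
        WakeWordCallback,   // method names are case-sensitive in C#
        contextPath,
        InferenceCallback);

    // DrawBoard takes the text to show above the board; nothing to report yet
    DrawBoard("");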

    // create and start recording
    using (PvRecorder recorder = PvRecorder.Create(-1, picovoice.FrameLength))
    {
        recorder.Start();

        Console.WriteLine($"Using device: {recorder.SelectedDevice}");
        Console.WriteLine("Listening...");

        while (!_quitGame)
        {
            short[] pcm = recorder.Read();
            picovoice.Process(pcm);

            Thread.Yield();
        }
    }
}
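
The loop above runs until `_quitGame` becomes true, but the body of `QuitGame` is elided in this article. The sketch below shows one way the pieces could fit together, assuming `QuitGame` only needs to flip the flag. The `readFrame` and `processFrame` delegates are stand-ins for `recorder.Read` and `picovoice.Process` so the pattern can be run without a microphone; the `GameLoop` class is hypothetical.

```csharp
using System;
using System.Threading;

static class GameLoop
{
    // volatile is a precaution in case the flag is ever set from another thread.
    static volatile bool _quitGame;

    // What QuitGame might reduce to: signal the loop to stop.
    public static void QuitGame() => _quitGame = true;

    // Reads and processes frames until QuitGame is called.
    // Returns the number of frames processed.
    public static int Run(Func<short[]> readFrame, Action<short[]> processFrame)
    {
        _quitGame = false;
        int frames = 0;
        while (!_quitGame)
        {
            processFrame(readFrame());
            frames++;
            Thread.Yield();
        }
        return frames;
    }
}
```

Because the flag is checked once per frame, the loop exits right after the frame in which the quit intent is handled, and the surrounding `using` blocks then dispose the recorder and the Picovoice instance.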

Voila!

Resources:
Original Medium Article
Tutorial Source Code GitHub
Picovoice Console
