a blog for those who code

Saturday 31 October 2015

Speech to Text Application in Node.js

In this post we will show you how to create a speech-to-text application in Node.js using Edge.js. A speech-to-text application listens to what you say and converts it to text. Edge.js connects Node.js and .NET, so you write the code in C# but run it from Node.js.
If you are not familiar with Edge.js, read Write .Net and Node.js code together using Edge.js. Edge.js lets you connect Node.js and .NET on Windows, macOS, and Linux, although the System.Speech assembly used in this post is available only on Windows. You need to install Edge.js first (npm install edge) before proceeding with the code.

Recommended Reading: Simple Text to Speech Application in Node.js

Code for Speech to Text:

var edge = require('edge');
var value = edge.func(function() {/*
  #r "System.Speech.dll"
  #r "System.Threading.dll"

  using System;
  using System.Threading.Tasks;
  using System.Speech.Synthesis;
  using System.Speech.Recognition;
  using System.Threading;

  public class Startup
  {
    public ManualResetEvent _completed { get; set; }
    public async Task<object> Invoke(object input)
    {
      _completed = new ManualResetEvent(false);
      SpeechRecognitionEngine recog = new SpeechRecognitionEngine();
      Console.WriteLine("Speak Now...");
            
      recog.LoadGrammar(new Grammar(new GrammarBuilder("one")) { Name = "one" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("two")) { Name = "two" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("three")) { Name = "three" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("four")) { Name = "four" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("five")) { Name = "five" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("six")) { Name = "six" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("seven")) { Name = "seven" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("eight")) { Name = "eight" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("nine")) { Name = "nine" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("ten")) { Name = "ten" });
      recog.LoadGrammar(new Grammar(new GrammarBuilder("exit")) { Name = "exit" });
            
      recog.SpeechRecognized += recog_SpeechRecognized;
      recog.SetInputToDefaultAudioDevice();
      recog.RecognizeAsync(RecognizeMode.Multiple);
      // Wait on a thread-pool thread so the Node.js event loop is not blocked.
      await Task.Run(() => _completed.WaitOne());
      recog.Dispose();
      return null;
    }

    public void recog_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
      if (e.Result.Text == "one" || e.Result.Text == "two" || e.Result.Text == "three" || e.Result.Text == "four" || e.Result.Text == "five" || e.Result.Text == "six" || e.Result.Text == "seven" || e.Result.Text == "eight" || e.Result.Text == "nine" || e.Result.Text == "ten")
      {
        Console.WriteLine(e.Result.Text);
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
          // Speak the recognized word back; the using block disposes the synthesizer.
          Prompt prompt = new Prompt("You Spoke " + e.Result.Text);
          synth.Speak(prompt);
        }
        Console.WriteLine("Speak Now...");
      }
      if(e.Result.Text == "exit")
      {
        Console.WriteLine("Thanks for Speaking. Exiting Now");
        _completed.Set();
      }
    }
  }
*/});

value(null, function (error, result) {
  if (error) throw error;
});

By default, Edge.js references only the core system assemblies (enough for namespaces such as System and System.Threading.Tasks), so any other assembly must be referenced explicitly with #r "DLLName.dll". Just as a C# console application needs a static Main method, Edge.js needs a class named Startup, and this class must have an Invoke method that matches the Func<object, Task<object>> delegate.
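On the Node.js side, the function returned by edge.func follows the usual Node.js callback convention: it takes an input and a callback that receives (error, result). The following is a minimal sketch of that contract in plain JavaScript. Note that fakeSpeechFunc is a hypothetical stand-in for the Edge.js function (so the sketch runs without Edge.js or a microphone) and describeOutcome is an illustrative helper, not part of Edge.js.

```javascript
// Hypothetical stand-in for a function created by edge.func(...).
// It has the same (input, callback) shape, but does no speech recognition.
function fakeSpeechFunc(input, callback) {
  if (input === 'fail') {
    // With Edge.js, an exception thrown in the C# code would be expected
    // to arrive here as the first callback argument.
    callback(new Error('recognition failed'));
    return;
  }
  // The value returned from Invoke would arrive as the second argument.
  callback(null, 'done: ' + input);
}

// Illustrative helper: turn the (error, result) pair into a log message
// instead of rethrowing the error.
function describeOutcome(error, result) {
  return error ? 'Error: ' + error.message : 'Result: ' + result;
}

fakeSpeechFunc('start', function (error, result) {
  console.log(describeOutcome(error, result));
});
```

Logging the error rather than throwing it, as sketched here, is one option when the application should keep running after a failed recognition session.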

Inside Invoke we first create a ManualResetEvent, which lets us know when the speaking session has finished. Then we load one grammar for each word from "one" to "ten", plus one for "exit". We also attach a SpeechRecognized handler that checks whether the recognized text matches one of the loaded words; if it does, the word is printed and spoken back. Saying "exit" signals the event and the application closes. The input is set to the default audio device by calling recog.SetInputToDefaultAudioDevice();
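The decision logic of the recognition handler can be sketched in plain JavaScript, which makes it easy to try out without a microphone. This is only an illustration of the branching described above, not the C# handler itself: NUMBERS and handleRecognized are hypothetical names, and a lookup in an array replaces the long chain of comparisons.

```javascript
// Words the recognizer is taught, mirroring the grammars loaded in the C# code.
var NUMBERS = ['one', 'two', 'three', 'four', 'five',
               'six', 'seven', 'eight', 'nine', 'ten'];

// Illustrative helper (not part of Edge.js or System.Speech):
// decide what to do with a recognized phrase.
function handleRecognized(text) {
  if (NUMBERS.indexOf(text) !== -1) {
    // A known number: echo it back and keep listening.
    return { say: 'You Spoke ' + text, exit: false };
  }
  if (text === 'exit') {
    // The exit word: nothing to say, stop listening.
    return { say: null, exit: true };
  }
  // Anything else is ignored.
  return { say: null, exit: false };
}

console.log(handleRecognized('three'));
console.log(handleRecognized('exit'));
```

Keeping the vocabulary in one list means a new word only has to be added in one place, rather than in both the grammar setup and the handler's condition.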

If your system does not have Visual Studio installed, you need to download System.Speech.dll and System.Threading.dll from the web and keep them in the same folder as your speech.js file.



1 comment:

  1. Thanks for the tutorial,
    I used this in my web app but all the html component (buttons) stop working when the microphone is active, kindly help?
