How to integrate the Google Speech Recognition API in a Webpage? - javascript

I want to use the Google Speech Recognition API (see the Google Speech API Reference) in a webpage by sending requests directly from the browser to Google.
Test requests using base64-encoded sample files already worked.
I am trying to use https://github.com/higuma/ogg-vorbis-encoder-js to record an Ogg file and send it to the API.
The request itself is a simple Ajax request.
Has anybody already implemented this in a web browser? (I also need iOS support, which should be possible now that a recent Safari update added recording support.)
At the moment I am a bit stuck, since the API just answers "{}".
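For reference, here is a minimal sketch of such a request. It assumes the Cloud Speech-to-Text v1 `speech:recognize` REST endpoint and an API key passed as a query parameter; the helper names (`bytesToBase64`, `buildRecognizeRequest`, `extractTranscripts`, `recognize`) are hypothetical, and the defaults (`OGG_OPUS`, 16000 Hz, `en-US`) are assumptions that must match the audio actually recorded. An empty "{}" response generally means that no speech was recognized, which can happen when `encoding` or `sampleRateHertz` does not describe the audio that was sent. The v1 encoding list includes `OGG_OPUS` but, as far as I can tell, not Ogg Vorbis, so that is worth checking first.

```javascript
// Convert raw bytes to base64 one character at a time, which avoids
// call-stack limits that String.fromCharCode(...bytes) hits on long clips.
function bytesToBase64(bytes) {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build the JSON body for speech:recognize.
// The defaults are assumptions; they must describe the recorded audio.
function buildRecognizeRequest(audioBytes, options = {}) {
  return {
    config: {
      encoding: options.encoding || 'OGG_OPUS',
      sampleRateHertz: options.sampleRateHertz || 16000,
      languageCode: options.languageCode || 'en-US',
    },
    audio: { content: bytesToBase64(audioBytes) },
  };
}

// Collect the top transcript of each result; a "{}" response yields [].
function extractTranscripts(response) {
  return (response.results || [])
    .map((r) => r.alternatives && r.alternatives[0] && r.alternatives[0].transcript)
    .filter((t) => typeof t === 'string');
}

// Send the audio and resolve with the recognized transcripts.
async function recognize(audioBytes, apiKey, options) {
  const res = await fetch(
    'https://speech.googleapis.com/v1/speech:recognize?key=' + encodeURIComponent(apiKey),
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(buildRecognizeRequest(audioBytes, options)),
    }
  );
  if (!res.ok) {
    throw new Error('Speech API returned HTTP ' + res.status);
  }
  return extractTranscripts(await res.json());
}
```

Usage would look like `recognize(new Uint8Array(arrayBuffer), 'YOUR_API_KEY')`, where `YOUR_API_KEY` is a placeholder. Note that a key embedded in a page is visible to every visitor, so it should at least be restricted in the Cloud console.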

Related

how to setup and use W3C Web Speech API from a local html file?

I want to make use of the W3C Web Speech API, which already has a demo at https://www.google.com/intl/en/chrome/demos/speech.html.
I want to extend it into a continuous speech recognition service, ideally one that runs from a local HTML file on an Android phone, so that I can add features to the current demo. However, I find that a local file cannot access the microphone when running in Chrome or an Android WebView.
I know little about how to set up this API. For example, what files do I need to install in order to use it?
Also, does this recognition service need to be paid for?

Can I save or record the audio I use with the Web Speech API, and can I import audio or video? - javascript

I am building a transcription web app, and I don't know whether that is possible using JavaScript. I'm not sure whether I can combine the Web Speech API and the MediaStream Recording API. Please help me 😊
I want to find ways to build a transcription web app with JavaScript.

Web Speech API filter audio input

I am using the Web Speech API for a Chrome extension.
What I want to do is filter the audio being sent to the recognition instance. Does anyone know of a way to do this? The API really only gives you control over the output of the recognition.

Why does the JavaScript Speech Recognition API not work without internet?

I was working with the JavaScript speech recognition API (new webkitSpeechRecognition()) and was surprised that it does not work without an internet connection. Since it is JavaScript code, I expected it to work offline.
I checked the Network tab of Chrome Developer Tools, and it does not even show a request being made.
On Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.
Looking at https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition:
SpeechRecognition.serviceURI: Specifies the location of the speech recognition service used by the current SpeechRecognition to handle the actual recognition. The default is the user agent's default speech service.
The actual recognition is done by a third-party server. I assume that speech recognition is currently either too much for a browser to cope with on its own or requires a large database.
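To make the offline failure visible in a page, a minimal sketch could feature-detect the constructor and watch for the `network` error code that the Web Speech API defines. The function name `createRecognizer` and the `handlers` object are hypothetical, and `en-US` is an assumed language; the scope object is passed in (normally `window`) only so that the logic can be exercised outside a browser.

```javascript
// Create a configured recognizer, or return null when the browser
// exposes neither SpeechRecognition nor the webkit-prefixed variant.
function createRecognizer(scope, handlers) {
  const Ctor = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Ctor) {
    return null;
  }

  const recognition = new Ctor();
  recognition.continuous = true;      // keep listening after each phrase
  recognition.interimResults = false; // report final results only
  recognition.lang = 'en-US';         // assumed language

  // Forward every finalized transcript to the caller.
  recognition.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) {
        handlers.onText(event.results[i][0].transcript);
      }
    }
  };

  // 'network' is the error code reported when the remote
  // recognition service cannot be reached.
  recognition.onerror = (event) => {
    if (event.error === 'network') {
      handlers.onError('Recognition service unreachable (offline?)');
    } else {
      handlers.onError('Recognition error: ' + event.error);
    }
  };

  return recognition;
}

// Browser usage:
// const r = createRecognizer(window, { onText: console.log, onError: console.warn });
// if (r) r.start();
```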

Is it possible and advisable to call Google Cloud Speech APIs directly from browsers, including Safari?

I am starting to explore Google Cloud Speech APIs.
I have read that
"Speech API supports any device that can send a REST request"
Therefore I am thinking that I could potentially call these APIs from any browser (on both laptops and mobile devices). Specifically, I am interested in scenarios where the APIs are used to translate "voice" to text. I am picturing something like the following:
the user records his/her voice and streams it to the API
the API transforms it to text, which is sent back to the browser
the browser takes actions using the text received (e.g. saves the text in a back-end DB)
I have searched a bit and collected some information, but I have some big areas of doubt which I would like to clear up before actually moving along this path:
Is it possible and simple to call Google Cloud APIs directly from the browser, i.e. using JavaScript? The doubt comes from the fact that the documentation shows Node.js examples but not pure JavaScript ones.
Can this scenario be implemented using Safari (both on desktop and on mobile)? The doubt comes from the fact that all the searches I have made so far point to pages saying that Safari does not support audio recording (i.e. the getUserMedia API of HTML5).
Any direction on these points will be very much appreciated.
As of iOS 11, Apple has added support for the getUserMedia API.
You can find out more here.
Update
Streaming Speech Recognition is a potential solution for streaming audio (https://cloud.google.com/speech/docs/streaming-recognize)
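Because recording support differs between browsers, a page would probably feature-detect before trying to capture audio. The sketch below is one way to do that, under these assumptions: `canRecordAudio`, `pickMimeType` and `recordClip` are hypothetical names, the candidate MIME types are only examples, and `navigator` and `MediaRecorder` are passed in as arguments so the checks can be run outside a browser. Keep in mind that `getUserMedia` is only exposed in secure contexts (HTTPS or localhost), and that `MediaRecorder` is a separate API that a browser may lack even when `getUserMedia` is present.

```javascript
// True when the given navigator object exposes getUserMedia.
function canRecordAudio(nav) {
  return !!(nav && nav.mediaDevices &&
    typeof nav.mediaDevices.getUserMedia === 'function');
}

// Return the first container/codec the recorder supports, or null.
function pickMimeType(Recorder, candidates) {
  if (!Recorder || typeof Recorder.isTypeSupported !== 'function') {
    return null;
  }
  return candidates.find((type) => Recorder.isTypeSupported(type)) || null;
}

// Record a clip of durationMs milliseconds and resolve with a Blob.
async function recordClip(nav, Recorder, durationMs) {
  if (!canRecordAudio(nav)) {
    throw new Error('getUserMedia is not available in this browser');
  }
  if (typeof Recorder !== 'function') {
    throw new Error('MediaRecorder is not available in this browser');
  }

  const stream = await nav.mediaDevices.getUserMedia({ audio: true });
  const mimeType = pickMimeType(Recorder, [
    'audio/ogg;codecs=opus',
    'audio/webm;codecs=opus',
    'audio/mp4',
  ]);
  const recorder = new Recorder(stream, mimeType ? { mimeType } : undefined);

  const chunks = [];
  recorder.ondataavailable = (event) => chunks.push(event.data);
  const stopped = new Promise((resolve) => { recorder.onstop = resolve; });

  recorder.start();
  setTimeout(() => recorder.stop(), durationMs);
  await stopped;

  // Release the microphone once recording has finished.
  stream.getTracks().forEach((track) => track.stop());
  return new Blob(chunks, { type: recorder.mimeType });
}

// Browser usage:
// recordClip(navigator, window.MediaRecorder, 5000).then((blob) => { /* upload */ });
```

The MIME type that ends up being chosen determines which `encoding` value, if any, the Speech API can accept, so the two have to be checked against each other.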
