I want to create a commercial web application based on speech recognition. I have found the Web Speech API (https://w3c.github.io/speech-api/), which is currently supported only by Chrome.
Can I use this API for free for my commercial application? Is there a limit on the number of uses per day, or a free quota that I must not exceed?
From https://lists.w3.org/Archives/Public/public-speech-api/2013Jul/0001.html
To clarify, commercial apps that use the Web Speech
API https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html can
be used on browsers that support it, such as Chrome (for complete
details, see the Terms of Service in the About menu).
References:
https://groups.google.com/a/chromium.org/g/chromium-html5/c/31rlwXxzQGs/m/BbeSI_waCQAJ
More discussion can be found by searching here:
https://groups.google.com/a/chromium.org/g/chromium-html5/search?q=web%20speech%20api
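Since the API is only available in browsers that support it, an application will probably want to detect it before relying on it. Below is a minimal sketch of that check, not an official snippet. The names createRecognizer, SpeechHost and RecognitionLike are hypothetical helpers introduced here; the only API names assumed are SpeechRecognition and the prefixed webkitSpeechRecognition that Chrome exposes.

```typescript
// Minimal local typings: only the members used here are declared,
// because the Web Speech API is not present in every TypeScript DOM lib.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult:
    | ((event: { results: ArrayLike<ArrayLike<{ transcript: string }>> }) => void)
    | null;
  start(): void;
}
type RecognitionCtor = new () => RecognitionLike;

// Hypothetical shape of the global object (window) as far as this sketch cares.
interface SpeechHost {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor; // prefixed name used by Chrome
}

// Returns a configured recognizer, or null when the browser has no support.
function createRecognizer(host: SpeechHost, lang = "en-US"): RecognitionLike | null {
  const Ctor = host.SpeechRecognition ?? host.webkitSpeechRecognition;
  if (!Ctor) return null;
  const recognizer = new Ctor();
  recognizer.lang = lang;
  recognizer.interimResults = false; // only deliver final results
  return recognizer;
}
```

In a page this would be called as createRecognizer(window as unknown as SpeechHost); if the result is not null, assign an onresult handler and call start().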
I am developing a mobile automation framework, and I have a requirement to capture the network traffic of the various API calls made when tapping on different sections of our app.
Basically, these network calls (API calls) are used to verify application usage and to generate analytics reports that define future product development strategy.
I am using WebdriverIO with Appium for mobile UI automation. To verify these API calls manually, my team uses Charles Proxy.
I am exploring a few options to automate this task, but if you have a proven solution, please share it. Thanks in advance for your help.
WebdriverIO already supports this feature: https://webdriver.io/docs/proxy/#proxy-between-browser-and-internet
They expose an API using browsermob-proxy (https://github.com/lightbody/browsermob-proxy), which is a proven solution for capturing network traffic in mobile automation.
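browsermob-proxy records traffic in the HAR (HTTP Archive) format, so the verification step usually comes down to searching the captured entries for the expected calls. The sketch below shows only that last step, under assumptions: findCallsToHost is a hypothetical helper, the interfaces cover just the HAR fields it reads, and the host name in the usage note is made up.

```typescript
// Minimal subset of the HAR format: only the fields this sketch reads.
interface HarEntry {
  request: { method: string; url: string };
  response: { status: number };
}
interface Har {
  log: { entries: HarEntry[] };
}

// Returns every captured request that was sent to the given host.
function findCallsToHost(har: Har, host: string): HarEntry[] {
  return har.log.entries.filter((entry) => {
    try {
      return new URL(entry.request.url).host === host;
    } catch {
      return false; // skip entries whose URL cannot be parsed
    }
  });
}
```

After tapping a section of the app, a test could fetch the HAR from the proxy and check that findCallsToHost(har, "analytics.example.com") contains the expected request.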
Web Bluetooth, from reading through the spec and the APIs, seems like it can only handle pairing with a BT device and then reading or writing to it.
But it doesn't seem able to expose a new service, or a new characteristic on an existing service, on the Bluetooth adapter of the machine where the page using Web Bluetooth runs.
Did I miss a way to use Web Bluetooth to create a new service for as long as the page is open, so that other machines/devices can pair with the one running the page's script and use this new service?
My main interest is mesh networking with BTLE using the Web Bluetooth API, but for that all devices running the page need to be able to not only connect to other peers, but also to be connectable-to by other peers. That part I have no idea how to achieve with the current API.
Can I get a definitive answer on whether it is possible to contribute new services to the BT device of the computer the script runs on? Links to where this is being discussed in the WGs and elsewhere would also be great; if it is not possible, I am interested in why.
I have learnt a crucial piece of nomenclature which now allows me to answer this question: peripheral mode.
Not all Bluetooth adapters support it, and the Web Bluetooth standard does not look likely to support web pages acting as beacons/peripherals any time soon:
https://github.com/WebBluetoothCG/web-bluetooth/issues/231
So, as of now, what I asked for is not possible.
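For contrast, what the API does offer is the central (client) role: a page can pick a remote peripheral, connect to its GATT server, and read a value. The sketch below illustrates that flow. readBatteryLevel and the *Like interfaces are hypothetical names declared locally for illustration, and the standard battery_service / battery_level names are used only as an example.

```typescript
// Minimal local typings for the parts of Web Bluetooth used below.
interface CharacteristicLike {
  readValue(): Promise<DataView>;
}
interface ServiceLike {
  getCharacteristic(name: string): Promise<CharacteristicLike>;
}
interface GattServerLike {
  getPrimaryService(name: string): Promise<ServiceLike>;
}
interface DeviceLike {
  gatt?: { connect(): Promise<GattServerLike> };
}
interface BluetoothLike {
  requestDevice(options: { filters: Array<{ services: string[] }> }): Promise<DeviceLike>;
}

// Central role only: choose a remote peripheral, connect, read one value.
// Nothing here advertises a service from the local machine.
async function readBatteryLevel(bluetooth: BluetoothLike): Promise<number> {
  const device = await bluetooth.requestDevice({
    filters: [{ services: ["battery_service"] }],
  });
  if (!device.gatt) throw new Error("Device does not expose a GATT server");
  const server = await device.gatt.connect();
  const service = await server.getPrimaryService("battery_service");
  const characteristic = await service.getCharacteristic("battery_level");
  const value = await characteristic.readValue();
  return value.getUint8(0); // battery level as a percentage
}
```

In a supporting browser the argument would be navigator.bluetooth, and the call has to be triggered by a user gesture such as a button click.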
I am starting to explore Google Cloud Speech APIs.
I have read that
"Speech API supports any device that can send a REST request"
Therefore I am thinking that I could potentially call these APIs from any browser (both on laptops and on mobile devices). Specifically, I am interested in scenarios where the APIs are used to transcribe "voice" to text. I am picturing something like the following:
the user records his/her voice and streams it to the API
the API transforms it to text, which is sent back to the browser
the browser takes actions using the text received (e.g. saves the text in a back-end DB)
I have searched a bit and collected some information, but I have some big areas of doubt which I would like to clear up before actually moving along this path:
Is it possible and simple to call Google Cloud APIs directly from the browser, i.e. using JavaScript? The doubt comes from the fact that the documentation shows Node.js examples but not plain JavaScript ones.
Can this scenario be implemented using Safari (both on desktop and on mobile)? The doubt comes from the fact that all the searches I have made so far point to pages saying that Safari does not support audio recording (i.e. the getUserMedia API of HTML5).
Any direction on these points will be very much appreciated.
As of iOS 11, Apple has added support for the getUserMedia API.
You can find out more here.
Update
Streaming Speech Recognition is a potential solution for streaming audio (https://cloud.google.com/speech/docs/streaming-recognize)
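On the first doubt in the question: the non-streaming REST method can be called with plain fetch, so no Node.js client library is strictly required. The following is a rough sketch under stated assumptions, not Google's sample code. buildRecognizeRequest, extractTranscript and recognize are hypothetical helper names, and the encoding and sample rate are placeholder values that must match the audio actually recorded. Also keep in mind that an API key embedded in a page is visible to every user.

```typescript
interface RecognizeRequest {
  config: { encoding: string; sampleRateHertz: number; languageCode: string };
  audio: { content: string }; // base64-encoded audio bytes
}

// Builds the JSON body for the speech:recognize REST method.
// Encoding and sample rate are assumptions; adjust them to the recording.
function buildRecognizeRequest(base64Audio: string, languageCode = "en-US"): RecognizeRequest {
  return {
    config: { encoding: "LINEAR16", sampleRateHertz: 16000, languageCode },
    audio: { content: base64Audio },
  };
}

interface RecognizeResponse {
  results?: Array<{ alternatives?: Array<{ transcript?: string }> }>;
}

// Joins the top alternative of each result into a single string.
function extractTranscript(response: RecognizeResponse): string {
  return (response.results ?? [])
    .map((result) => result.alternatives?.[0]?.transcript ?? "")
    .join(" ")
    .trim();
}

// Sends the audio and resolves with the recognized text.
async function recognize(base64Audio: string, apiKey: string): Promise<string> {
  const url = `https://speech.googleapis.com/v1/speech:recognize?key=${encodeURIComponent(apiKey)}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRecognizeRequest(base64Audio)),
  });
  if (!res.ok) throw new Error(`Speech API returned HTTP ${res.status}`);
  return extractTranscript(await res.json());
}
```

The browser would record audio via getUserMedia, base64-encode the bytes, call recognize, and then save the returned text to the back end.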
Is there a way to make a browser aware of iBeacon devices in its proximity?
Similar to the way HTML5 Geolocation works...
If not, could this be achieved with a browser plug-in that provides the details for JavaScript to consume?
Unfortunately, no. No web browsers have implemented any bridges between beacon detection and JavaScript.
I don't think a plugin approach is possible on mobile browsers (either iOS or Android), because neither Mobile Safari nor Mobile Chrome supports asynchronous communication between external apps and JavaScript. The best you could do is have a custom app that responds to a beacon and then launches a web page in the browser. But I realize that isn't what you are asking for.
If you want to build a native app with JavaScript, you can use Cordova (aka PhoneGap) and use plugins that provide beacon support. My company has one for our ProximityKit beacon framework:
https://github.com/RadiusNetworks/proximitykit-plugin-cordova
There is also a Cordova plugin that has basic beacon support here:
https://github.com/petermetz/cordova-plugin-ibeacon
This looks promising (as of 2016): the Google Chrome developers site shows a desktop browser feature as a work in progress.
https://developers.google.com/web/updates/2015/07/interact-with-ble-devices-on-the-web?hl=en
Even though the Web Bluetooth API specification is not finalized yet,
the Chrome Team is actively looking for enthusiastic developers (I
mean you) to try out this work-in-progress API and give feedback on
the spec and feedback on the implementation.
Web Bluetooth API is currently available to be enabled experimentally
on your origin in Origin Trials, or locally on your machine using an
experimental flag. The implementation is partially complete and
currently available on Chrome OS, Chrome for Android M, Linux, and
Mac.
Go to chrome://flags/#enable-web-bluetooth, enable the highlighted
flag, restart Chrome and you should be able to scan for and connect to
nearby Bluetooth devices, read/write Bluetooth characteristics,
receive GATT Notifications and know when a Bluetooth device gets
disconnected.
https://github.com/WebBluetoothCG/web-bluetooth/blob/gh-pages/implementation-status.md
There's a W3C specification for this, Web Bluetooth, but there's no support yet: http://caniuse.com/#search=bluetooth.
If you decide to write a PhoneGap plugin, implementing this spec would be a good starting point.
Can I improve Google Speech API recognition by giving it a word list (in my case the user's requests are very predictable) to make recognition more accurate?
The correct answer is: no, you can't. =(
I can't speak for Chrome, but in Android they are quite clear that you cannot provide a grammar. In Android speech recognition you are limited to a choice of two models: "free form" and "web search".
See Android: Speech Recognition Append Dictionary?
This applies to the Google Cloud Speech API (not the Web Speech API), but some may find it useful:
Although currently in beta, Google has released a new capability which allows you to
"include a list of phrases to act as "hints" to Cloud Speech-to-Text. Providing these hints, a technique called speech adaptation, helps Speech-to-Text API to recognize the specified phrases from your audio data."
See https://cloud.google.com/speech-to-text/docs/context-strength
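As a rough illustration of what such hints look like in a request, the sketch below builds a recognition config whose speechContexts field carries the expected phrases. withPhraseHints is a hypothetical helper, and the encoding, sample rate and example phrases are assumptions rather than values from the documentation.

```typescript
interface SpeechContext {
  phrases: string[];
}
interface AdaptedConfig {
  encoding: string;
  sampleRateHertz: number;
  languageCode: string;
  speechContexts: SpeechContext[];
}

// Builds a recognition config with phrase hints (speech adaptation).
// Blank entries and duplicates are dropped before sending.
function withPhraseHints(phrases: string[], languageCode = "en-US"): AdaptedConfig {
  const unique = Array.from(
    new Set(phrases.map((p) => p.trim()).filter((p) => p.length > 0)),
  );
  return {
    encoding: "LINEAR16", // assumption: must match the recorded audio
    sampleRateHertz: 16000, // assumption: must match the recorded audio
    languageCode,
    speechContexts: [{ phrases: unique }],
  };
}
```

For a voice interface with predictable requests, the call might be withPhraseHints(["turn on the lights", "turn off the lights"]), with the result sent as the config part of the recognize request.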