WebRTC Voice Activity Detection - javascript

I'm implementing a very simple WebRTC audio/video chat room using JavaScript and HTML.
It should run in all major browsers (Chrome, Firefox, Safari, and Edge).
The application layout shows all the users currently collaborating, each one inside its own "square".
I would like to highlight the user that is currently speaking with a different square border color.
To the best of my knowledge, the only way to implement this is with some built-in VAD API exposed to JavaScript by the browser's WebRTC stack.
Could you please tell me whether this is possible, and if so, which APIs I should be using?
Code examples would be very useful.

I've never tried it, but I think you can extract the C++ code from WebRTC related to VAD and build it using WASM.
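Short of compiling WebRTC's native VAD to WASM, a common lighter-weight approximation is an energy threshold computed with the Web Audio API's AnalyserNode. A minimal sketch follows; the fftSize, the 0.01 threshold, and the "speaking" CSS class are assumptions you'd tune for your app:

// Approximate VAD: poll the RMS level of a participant's MediaStream
// and toggle a CSS class on their tile. Note that some browsers require
// the AudioContext to be resumed after a user gesture.
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();

function watchSpeaking(stream, tile) {
  const source = audioCtx.createMediaStreamSource(stream);
  const analyser = audioCtx.createAnalyser();
  analyser.fftSize = 512;
  source.connect(analyser);

  const samples = new Float32Array(analyser.fftSize);

  (function poll() {
    analyser.getFloatTimeDomainData(samples);
    let sum = 0;
    for (const s of samples) sum += s * s;
    const rms = Math.sqrt(sum / samples.length); // rough loudness measure
    tile.classList.toggle('speaking', rms > 0.01); // assumed threshold
    requestAnimationFrame(poll);
  })();
}

Call watchSpeaking(remoteStream, squareElement) for each participant, and a .speaking { border-color: ... } rule handles the highlight. Libraries like hark wrap essentially this approach.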

How does Disney+ prevent screen recordings of their content?

I tried to take a screenshot of a movie in the Disney+ web app and realised that the video turns black as soon as I start a capture with the Snipping Tool. When I tried the same thing with OBS and Discord streaming, I saw the same effect.
Interestingly, this only works for Chrome on my machine (I also tried Firefox and Edge and they just let me record my screen).
When I saw this, I became really curious about how they achieved it.
Does anyone have any idea how I can recreate this for my own web projects?
I became really curious about how they achieved it.
They use Widevine.
Widevine homepage.
https://ottverse.com/widevine-drm-how-does-it-work/
https://en.wikipedia.org/wiki/Widevine
News reports:
https://www.cordcuttersnews.com/sadly-disney-wont-work-on-chromebooks-linux-some-android-devices-because-of-drm/
https://www.tomsguide.com/news/disney-plus-will-work-on-chromebooks
https://www.androidpolice.com/2019/10/22/disney-will-only-work-on-devices-that-support-the-strictest-widevine-l3-drm/
It's also used by Netflix, Hulu and others.
Widevine is Google's DRM system that's baked into Chrome.
All the other major browsers have adopted it as well, because no one will use a browser that can't access Netflix.
Mozilla's and Microsoft's implementations are less user-hostile and, as you noticed, don't black out screen captures.
It's just a standard HTML5 <video> element - when the browser downloads the video stream it will see that it's encrypted with Widevine and that engages the Widevine client-side code which does all the DRM biz.
Though there are HTML and DOM features that facilitate DRM, I'm unsure to what extent any JavaScript is required to use it, as theoretically everything the browser needs to know to load the DRM system should be embedded in the raw media stream.
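For what it's worth, the JavaScript entry point pages use to engage a DRM system is EME's navigator.requestMediaKeySystemAccess. A minimal sketch that only probes whether the Widevine CDM is available (the codec string here is an assumption):

// Probe for the Widevine CDM via Encrypted Media Extensions (EME).
// This only checks availability; actually playing protected content
// additionally requires license-server plumbing on the page's side.
navigator.requestMediaKeySystemAccess('com.widevine.alpha', [{
  initDataTypes: ['cenc'],
  videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }]
}]).then(
  () => console.log('Widevine is available'),
  () => console.log('Widevine is not available')
);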
On Windows, I understand (though unconfirmed) that Widevine makes use of SetWindowDisplayAffinity to block screenshots.
Nothing stops you from doing this in your own native code (e.g. if you had an Electron fork), but please don't because it's a real dick-move to your users, in addition to not working at all if the user has the DWM disabled (e.g. they're running Windows 7 with Aero disabled).
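(As an aside, stock Electron exposes this without a fork: BrowserWindow.setContentProtection(true), which reportedly maps to SetWindowDisplayAffinity on Windows. A minimal sketch, with the URL as a placeholder:)

// Electron main process: any capture of this window shows black frames.
const { app, BrowserWindow } = require('electron');

app.whenReady().then(() => {
  const win = new BrowserWindow({ width: 1280, height: 720 });
  win.setContentProtection(true); // SetWindowDisplayAffinity under the hood
  win.loadURL('https://example.com'); // placeholder content
});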
Does anyone have any idea how I can recreate this for my own web projects?
You'll need to license Widevine yourself. This is a complicated process intended only for large media production companies and content rightsholders, not individuals or small businesses.
Anyway, even if you could, please don't. Why would you want to make it harder for users to share and appreciate your media? Just stick it up on YouTube instead.

Using SSDP to display all devices on the network

I have googled this question quite a bit but am still a little confused as to whether what I'm trying to do is even possible.
Basically, I am trying to add a dropdown menu to my web application that lists all devices connected to the network. By devices I don't mean every device; I mean certain hardware devices I am using that implement SSDP. I have already written Node.js programs that send M-SEARCHes and successfully find all the devices, but I understand that Node.js is not browser JavaScript, and there is no way to display the output of a Node program in a browser (please correct me if I am wrong).
After doing a bit more research, I realized that the alternatives for doing something of this sort in a browser are either to create some sort of Chrome extension that can do SSDP and send M-SEARCHes, or to open WebSockets using a WebSocket API (I don't think this is particularly useful for SSDP in my case, but I may be wrong).
Given what I am trying to do, is either of these alternatives helpful? Is what I am trying to do even possible? Once again, I have done my research on this topic but haven't been able to find a clear answer. If it is possible, I'd really appreciate links to tutorials or just general ideas on how to accomplish it.
I know I posted something on Stack Overflow recently about this which got no answers or replies, but I have done more research into the topic and feel like I now have a better understanding. That being said, I'd still appreciate some direction on how to approach this problem, as I haven't found anything too useful online.
Thank you for your time!
Chrome extensions cannot access the sockets.udp API as far as I know. The right place to do that in Chrome would probably have been a Chrome App, as they can do UDP Multicast: https://codereview.chromium.org/12684008/ . In fact there seems to be an SSDP app already ...
Unfortunately, Chrome Apps have been deprecated in favor of normal web apps (outside of Chrome OS, at least), and as you've found out, you can't do SSDP through normal web APIs yet. A socket API is in the works, but there's no telling if and when they might solve the security problems inherent in allowing a random web app to do things like join a local multicast group.
Websockets are unlikely to provide what you need.
It's possible.
Node.js is not a browser javascript and there is no way I could display the output of a Node call in a terminal on a browser
They both run JavaScript. Run your Node.js program in a terminal, or pipe the output to a text file if a terminal is not accessible; in both cases console.log() should be able to print the output.
For SSDP on client and server side, use this : https://www.npmjs.com/package/node-ssdp
You need not use a Chrome App specifically. You can write apps in JavaScript-based cross-platform frameworks like Electron; it'll become a fully functional 'web' app for PCs, and for mobile you can use Cordova and the like.
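For reference, an M-SEARCH with the node-ssdp package linked above looks roughly like this. It runs in Node.js (e.g. behind an Express endpoint), not in the browser, and the 'ssdp:all' target is a catch-all you'd narrow to your device's URN:

// Discover SSDP devices from Node.js using node-ssdp.
const { Client } = require('node-ssdp');
const client = new Client();

client.on('response', (headers, statusCode, rinfo) => {
  // headers.LOCATION points at the device's description XML.
  console.log('Found device at %s: %s', rinfo.address, headers.LOCATION);
});

client.search('ssdp:all'); // or a specific urn:... for your hardware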

Can I use javascript to record voice on a web app?

It seems I can only use Flash or Java to record voice on a web app. Is there a way of doing it via JavaScript?
It can be done but the solution won't work across all platforms at the moment.
<input type="file" accept="audio/*;capture=microphone">
See HTML5 Media Capture
Currently Supported By:
Android 3.0 browser, Chrome for Android (0.16), Firefox Mobile 10.0, iOS6 Safari and Chrome (partial support)
Links:
http://www.html5rocks.com/en/tutorials/getusermedia/intro/
Audio capturing with HTML5
JavaScript cannot access your hardware directly. What you need is a client-side technology that can; Flash, for one.
JavaScript can communicate quite easily with Flash, so you can hide your Flash recorder and construct your recorder UI with HTML/JS/CSS.
Here's one example: https://github.com/jwagener/recorder.js/blob/master/examples/example-1.html
Here's another one:
http://blogupstairs.com/flashwavrecorder-javascript-flash-audio-recorder/
I realize this is not EXACTLY what you need, but you didn't say why you want a JS solution. This doesn't fix the Flash dependency problem, but it solves the UI problem, since you can construct the UI without Flash.
Another well-known solution is WAMI. I know it's not pure JavaScript, but maybe it can help.
"As of this writing, most browsers still do not support WebRTC's getUserMedia(), which promises to give web developers microphone access via Javascript. This project achieves the next best thing for browsers that support Flash. Using the WAMI recorder, you can collect audio on your server without installing any proprietary media server software."
https://code.google.com/p/wami-recorder/
Another example uses Node.js:
This example application is written in JavaScript and uses Node and Express for the web server and framework. You will need Node, Express, and the Node.js WebAPI Library installed on your web server in order for this to work.
nodejs voice recording example
Yes, there is a pure HTML/JavaScript way, but it only works in Firefox and Chrome:
http://audior.ec/blog/recording-mp3-using-only-html5-and-javascript-recordmp3-js/
Direct demo: http://audior.ec/recordmp3js/
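For readers on modern browsers: the getUserMedia() microphone access that the WAMI quote above was waiting on has since shipped, along with the MediaRecorder API. A minimal sketch (the five-second cutoff and local playback are just for illustration):

// Record the microphone with getUserMedia + MediaRecorder.
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const recorder = new MediaRecorder(stream);
  const chunks = [];

  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: recorder.mimeType });
    new Audio(URL.createObjectURL(blob)).play(); // or upload the blob
  };

  recorder.start();
  setTimeout(() => recorder.stop(), 5000); // stop after five seconds
});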

Adding window.bluetooth object to Chrome

I want to demo a web page being used to interact with a physical object in the same proximity as a web-enabled device (Mac/Windows/Linux laptop). In order to do this, I want to create my own window.bluetooth object in JavaScript that will provide an interface to the host device's Bluetooth controller via the Serial Port Profile. For now it's just a demo, but I might want to develop a generic API to abstract Bluetooth drivers in JavaScript.
I'm not particularly concerned with portability or generic solutions at this point. I just want to see if it would work on my laptop with a device I'm building using a BlueSMiRF Silver modem. I know Google Chrome extensions are capable of injecting JavaScript into every page the user visits, and NPAPI allows native OS code to be compiled into a plugin that can communicate with JavaScript. It looks like someone has done something vaguely similar before with slightly more specific applications.
My question is, is a Chrome extension with NPAPI the best way to do this? Alternatives could be Flash or a Java applet, but those are kind of 1996 solutions. Here are the metrics I use to evaluate a solution:
Feasibility. Is it possible?
Ease of development. How many lines of code would it take?
Leverage. Does anything else out there already do something similar?
For those of you thinking it's preposterous for the browser to monitor lower-level network status, it's already been done with Wi-Fi.
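For comparison, the Bluetooth object browsers eventually did ship is Web Bluetooth's navigator.bluetooth. It speaks BLE/GATT rather than the Serial Port Profile, so it wouldn't talk to a BlueSMiRF as-is, but it shows the shape such an API took (the standard battery service here is purely illustrative):

// Web Bluetooth sketch; must be triggered by a user gesture (e.g. a click).
async function connectDemo() {
  const device = await navigator.bluetooth.requestDevice({
    filters: [{ services: ['battery_service'] }] // illustrative service
  });
  const server = await device.gatt.connect();
  const service = await server.getPrimaryService('battery_service');
  const characteristic = await service.getCharacteristic('battery_level');
  const value = await characteristic.readValue();
  console.log('Battery level:', value.getUint8(0), '%');
}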

Offline / Non-Realtime Rendering with the Web Audio API

The Problem
I'm working on a web application where users can sequence audio samples and optionally apply effects to the musical patterns they create using the Web Audio API. The patterns are stored as JSON data, and I'd like to do some analysis of the rendered audio of each pattern server-side. This leaves me with two options, as far as I can see:
Run my own rendering code server-side, trying to make it as faithful as possible to the in-browser rendering. Maybe I could even pull out the Web Audio code from the Chromium project and modify that, but this seems like potentially a lot of work.
Do the rendering client-side, hopefully faster-than-realtime, and then send the rendered audio to the server. This is ideal (and DRY), because there's only one engine being used for pattern rendering.
The Possible Solution
This question led me to this code sample in the Chromium repository, which seems to indicate that offline processing is a possibility. The trick seems to be constructing a webkitAudioContext with some arguments (usually, a zero-argument constructor is used). The following are my guesses at what the parameters mean:
new webkitAudioContext(2,          // channels
                       10 * 44100, // length in samples
                       44100);     // sample rate
I adapted the sample slightly, and tested it in Chrome 23.0.1271.91 on Windows, Mac, and Linux. Here's the live example, and the results (open up the Dev Tools JavaScript Console to see what's happening):
Mac - It Works!!
Windows - FAIL - SYNTAX_ERR: DOM Exception 12
Linux - FAIL - SYNTAX_ERR: DOM Exception 12
The webkitAudioContext constructor I described above causes the exception on Windows and Linux.
My Question
Offline rendering would be perfect for what I'm trying to do, but I can't find documentation anywhere, and support is less-than-ideal. Does anyone have more information about this? Should I be expecting support for this in Windows and/or Linux soon, or should I be expecting support to disappear soon on Mac?
I did some research on this a few months back, and there is a startRendering function on the audioContext, but I was told by Google people that the implementation was, at that time, due to change. I don't think this has happened yet, and it's still not a part of the official documentation, so I'd be careful building an app that depends on it.
The current implementation doesn't render any faster than realtime either (maybe slightly faster in very light applications), and sometimes even slower than realtime.
Your best bet is hitting the trenches and implementing Web Audio server-side if you need non-realtime rendering. If you could live with realtime rendering, there's a project at https://github.com/mattdiamond/Recorderjs which might be of interest.
Please note that I'm not a Googler myself, and what I was told was not a promise in any way.
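For reference, the three-argument constructor described above was later standardized as OfflineAudioContext, with rendering kicked off explicitly via startRendering() and running faster than realtime. A minimal sketch, with an oscillator standing in for the pattern's real audio graph:

// Render ten seconds of stereo audio at 44.1 kHz without playing it.
const offlineCtx = new OfflineAudioContext(2, 10 * 44100, 44100);

const osc = offlineCtx.createOscillator(); // placeholder for the real graph
osc.connect(offlineCtx.destination);
osc.start();

offlineCtx.startRendering().then((renderedBuffer) => {
  // renderedBuffer is an AudioBuffer you can encode and POST to the server.
  console.log('Rendered', renderedBuffer.duration, 'seconds of audio');
});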
