I have several audio tracks, which I got from getUserMedia (microphone). They are being transmitted via WebRTC.
I want to highlight the stream that is active at the moment, but looking through the documentation for MediaStreamTrack I cannot find any method that would let me determine whether a given track is the most active one right now.
So, if there were a method to get the current output level, with some filtering I should be able to determine which one is the "most active" and highlight it.
Does such a method exist in the API? Is there another approach I can take?
The MediaStream object has APIs for detecting whether a stream is active, but its MediaStreamTrack does not.
Even if you only want to detect the active speaker via volume level, you need to pass the MediaStream to the Web Audio API (an AudioContext) to analyse it. example
If you have a proper RTCPeerConnection, you can use the getStats API. example
MediaStreamTrack doesn't have such a property. You can use the Web Audio API, as hark does, to get a volume indication and then determine who is speaking.
Your mileage may vary, though: active speaker detection is a hard problem.
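To make the Web Audio approach concrete, here is a minimal sketch of a per-stream volume meter, assuming a browser AudioContext and streams obtained from getUserMedia. The `rms` helper and the polling idea are illustrative; hark does something similar with extra smoothing and thresholds.

```javascript
// Root-mean-square level of a block of time-domain samples in [-1, 1].
function rms(samples) {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// Build a meter for one stream: route it into an AnalyserNode and
// return a function that reports the current volume level.
function createVolumeMeter(audioContext, stream) {
  const source = audioContext.createMediaStreamSource(stream);
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 2048;
  source.connect(analyser);
  const buf = new Float32Array(analyser.fftSize);
  return function currentLevel() {
    analyser.getFloatTimeDomainData(buf);
    return rms(buf);
  };
}

// Usage sketch: poll each meter and highlight the loudest stream.
// const meters = streams.map(s => createVolumeMeter(ctx, s));
// setInterval(() => { /* find index of the max meter() value */ }, 200);
```

Polling every couple hundred milliseconds is usually enough for a speaking indicator; anything finer just makes the highlight jittery.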
I want to gather stats on the number of users having to fall back to TURN servers. Is there a way to find out whether an RTCPeerConnection is using a TURN server instead of communicating "directly" with the remote peer?
I've tried using pc.getStats(), but that only gives me an object with a size property.
You want to use getSelectedCandidatePair. This returns the local/remote candidate pair that is being used. Each candidate has a type: host, srflx, prflx or relay; relay means it is using TURN.
Make sure to check both candidates: it is possible that both are relayed, or just one.
The getStats() result is a JavaScript Map object. You can iterate it to find what you need. To get the active candidate pair (and then determine its type) it is best to follow the code from this sample (which works around the quirks of some browsers) and then check whether either the local or remote candidateType is 'relay'.
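A sketch of that iteration, assuming a spec-compliant stats report where a "transport" entry carries selectedCandidatePairId (older browsers need the workarounds from the linked sample):

```javascript
// Pure helper: does either end of the selected pair use a TURN relay?
function usesTurn(localType, remoteType) {
  return localType === 'relay' || remoteType === 'relay';
}

// Walk the stats Map: find the transport's selected candidate pair,
// then look up both candidates and check their candidateType.
async function isUsingTurn(pc) {
  const report = await pc.getStats();
  let selectedPair = null;
  report.forEach(stats => {
    if (stats.type === 'transport' && stats.selectedCandidatePairId) {
      selectedPair = report.get(stats.selectedCandidatePairId);
    }
  });
  if (!selectedPair) return false; // no nominated pair yet
  const local = report.get(selectedPair.localCandidateId);
  const remote = report.get(selectedPair.remoteCandidateId);
  return usesTurn(local.candidateType, remote.candidateType);
}
```

Call this once the connection reaches the "connected" state; before that there may be no selected pair to inspect.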
In Firefox, I'm able to request a video stream of a window with
navigator.mediaDevices.getUserMedia({
  video: {
    mediaSource: 'window',
  },
})
This produces a dialog like this:
I only care about the current window. Is there a way to specify in my call to getUserMedia that I would like the current tab (or window) only?
I don't think so, no...
What Firefox implements here is not really specified yet, but the W3C is working on a new API that will take care of screen capture: MediaDevices.getDisplayMedia.
While it's not what Firefox implemented, there is a clear paragraph in this spec about why deviceId can't and won't work for such requests:
Each potential source of capture is treated by this API as a discrete media source. However, display capture sources MUST NOT be enumerated by enumerateDevices, since this would reveal too much information about the host system.
Display capture sources therefore cannot be selected with the deviceId constraint, since this would allow applications to influence selection; setting deviceId constraint can only cause the resulting MediaStreamTrack to become overconstrained.
So, even though Firefox does not implement this API yet, we can assume it follows the same rule in its current implementation, for the same reasons.
However, what will apparently be possible once this API comes to life is to use the "browser" constraint instead of "window". While the spec is not really clear about what exactly that means ("a browser display surface, or single browser window"), I guess it will be closer to what you want than "window". Someone even asked two days ago for a "tab" constraint; could we even hope for a "current-tab" one? That might need someone to open an issue on the W3C GitHub page.
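For reference, here is what a request against the draft getDisplayMedia API might look like. The "displaySurface" constraint value 'browser' is taken from the draft and may change; note the browser still shows a picker, so the page cannot silently force its own tab.

```javascript
// Draft-spec sketch: ask for a browser surface rather than a full
// window or monitor. 'browser' is the draft's name for a tab-like
// display surface and is an assumption here, not shipped behavior.
const displayConstraints = {
  video: { displaySurface: 'browser' },
};

// In a browser supporting the draft API:
// const stream = await navigator.mediaDevices.getDisplayMedia(displayConstraints);
// videoElement.srcObject = stream;
```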
In WebRTC we have the MediaStream and MediaStreamTrack interfaces.
A MediaStreamTrack represents an audio or video stream from a media source, so a consumer like a video or audio tag can simply take a MediaStreamTrack object and get the stream from it. What, then, is the need for the MediaStream interface?
According to the official documentation, a MediaStream synchronises one or more tracks. Does that mean it combines multiple streams from tracks and produces a single stream, so that we get video data together with audio?
For example: Does a video tag read the stream from MediaStream object or reads streams from the individual tracks?
This concept is not explained clearly anywhere.
Thanks in advance.
MediaStream has devolved into a simple container for tracks, representing video and audio together (a quite common occurrence).
It doesn't "combine" anything, it's merely a convenience keeping pieces together that need to be played back in time synchronization with each other. No-one likes lips getting out of sync with spoken words.
It's not even really technically needed, but it's a useful semantic in APIs for:
Getting output from hardware with camera and microphone (usually video and audio), and
Connecting it (the output) to a sink, like the HTML video tag (which accepts both video and audio).
Reconstituting audio and video at the far end of an RTCPeerConnection that belong together, in the sense that they should generally be played in sync with each other (browsers have more information about expectations on the far end this way, e.g. if packets from one track are lost but not the other).
Whether this is a useful abstraction may depend on the level of detail you're interested in. For instance the RTCPeerConnection API, which is still in Working Draft stage, has over the last year moved away from streams as inputs and outputs to dealing directly with tracks, because the working group believes that details matter more when it comes to transmission (things like tracking bandwidth use etc.)
In any case, going from one to the other will be trivial:
var tracks = stream.getTracks();
console.log(tracks.map(track => track.kind)); // audio,video
video.srcObject = new MediaStream(tracks);
once browsers implement the MediaStream constructor (slated for Firefox 44).
This is sort of expanding on my previous question Web Audio API- onended event scope, but I felt it was a separate enough issue to warrant a new thread.
I'm basically trying to do double buffering using the web audio API in order to get audio to play with low latency. The basic idea is we have 2 buffers. Each is written to while the other one plays, and they keep playing back and forth to form continuous audio.
The solution in the previous thread works well enough as long as the buffer size is large enough, but latency takes a bit of a hit, as the smallest buffer I ended up being able to use was about 4000 samples long, which at my chosen sample rate of 44.1k would be about 90ms of latency.
I understand that from the previous answer that the issue is in the use of the onended event, and it has been suggested that a ScriptProcessorNode might be of better use. However, it's my understanding that a ScriptProcessorNode has its own buffer of a certain size that is built-in which you access whenever the node receives audio and which you determine in the constructor:
var scriptNode = context.createScriptProcessor(4096, 2, 2); // buffer size, channels in, channels out
I had been using two alternating source buffers initially. Is there a way to access those from a ScriptProcessorNode, or do I need to change my approach?
No, there's no way to use other buffers in a ScriptProcessorNode. Today, your best approach is to use a ScriptProcessorNode and write the samples directly into its output.
Note that, the way AudioBuffers work, your previous approach wasn't guaranteed to avoid copying and creating new buffers anyway - you can't access a buffer from the audio thread and the main thread simultaneously.
In the future, using an audio worker will be a bit better - it will avoid some of the thread-hopping - but if you're (e.g.) streaming buffers down from a network source, you won't be able to avoid copying. (It's not that expensive, actually.) If you're generating the audio, you should generate it in onaudioprocess.
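A minimal sketch of that approach: instead of juggling two AudioBufferSourceNodes, feed a ScriptProcessorNode from a ring buffer. The ring buffer is plain JS (my own illustrative helper, not part of the API); the browser wiring is shown as comments.

```javascript
// Single-producer, single-consumer ring buffer of float samples.
// The producer calls write(); the audio callback calls read().
function createRingBuffer(size) {
  const data = new Float32Array(size);
  let readPos = 0, writePos = 0, available = 0;
  return {
    write(samples) {
      for (let i = 0; i < samples.length && available < size; i++) {
        data[writePos] = samples[i];
        writePos = (writePos + 1) % size;
        available++;
      }
    },
    read(out) {
      for (let i = 0; i < out.length; i++) {
        if (available > 0) {
          out[i] = data[readPos];
          readPos = (readPos + 1) % size;
          available--;
        } else {
          out[i] = 0; // underrun: emit silence instead of glitching
        }
      }
    },
  };
}

// Browser wiring sketch: a 256-frame callback gives roughly 6 ms
// granularity at 44.1 kHz, so latency is bounded by what you queue.
// const ring = createRingBuffer(8192);
// const node = context.createScriptProcessor(256, 0, 1);
// node.onaudioprocess = e => ring.read(e.outputBuffer.getChannelData(0));
// node.connect(context.destination);
```

Keeping only a few hundred samples queued in the ring buffer gets latency well under the 90 ms the double-buffer approach needed, at the cost of having to refill it promptly from the main thread.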
I am using the HTML5 Web Audio API in my application. The application is simple; I have:
BufferSourceNode -> GainNode -> lowpass filter -> context.destination
Now I want to save the output after applying the filters, so I decided to add a recorder before context.destination. But this doesn't work: it produces some noise while playing the audio, though my recorder records the filter effects successfully.
Am I doing it the right way, or is there a better way to do this?
Two things:
1) If you are going to use the buffer anyway - and even if you're not (*) - you might want to consider using an OfflineAudioContext (https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#OfflineAudioContext-section). OACs can run faster than real time, so you don't need to "record" it in real time; you set up your nodes, call startRendering(), and the oncomplete event hands you an AudioBuffer. (*) If you still want a .WAV file, you can pull the WAV-encoding function out of Recorderjs and use it to encode an arbitrary buffer.
2) That sounds like an error in your code - it should work either way, without causing extra noise. Do you have a code sample you can send me?
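A sketch of the OfflineAudioContext route for the graph above, assuming you already have a decoded AudioBuffer named `buffer`. The frame-count helper is plain arithmetic; the node names mirror the asker's chain.

```javascript
// Total frames needed to render `seconds` of audio at `sampleRate`.
function renderFrames(seconds, sampleRate) {
  return Math.ceil(seconds * sampleRate);
}

// Browser sketch: rebuild the same graph on an offline context, then
// render faster than real time and receive the result as an AudioBuffer.
// const frames = renderFrames(buffer.duration, 44100);
// const offline = new OfflineAudioContext(2, frames, 44100);
// const source = offline.createBufferSource();
// source.buffer = buffer;
// const gain = offline.createGain();
// const filter = offline.createBiquadFilter();
// filter.type = 'lowpass';
// source.connect(gain); gain.connect(filter); filter.connect(offline.destination);
// source.start(0);
// offline.oncomplete = e => { const rendered = e.renderedBuffer; /* encode to WAV */ };
// offline.startRendering();
```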