Interactive video using Media Source Extensions API - javascript

Background
I am trying to create an interactive video player using the HTML5 video element.
I came across the Media Source Extensions API and got it working.
var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);
// addSourceBuffer is only valid once the MediaSource fires "sourceopen"
mediaSource.addEventListener("sourceopen", () => {
  sourceBuffer = mediaSource.addSourceBuffer(mime);
});
Next I make a REST call to fetch the video segment and attach it to the source buffer.
sourceBuffer.appendBuffer(arrayBuffer);
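A rough sketch of that fetch-and-append step (the segment URL here is just a placeholder, and appendBuffer should only be called while the SourceBuffer is not already updating):
fetch("/segments/segment-1.mp4") // placeholder URL for the REST endpoint
  .then(response => response.arrayBuffer())
  .then(arrayBuffer => {
    // only append while no other append is in flight
    if (!sourceBuffer.updating) {
      sourceBuffer.appendBuffer(arrayBuffer);
    }
  });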
Problem
I am trying to fire events when the newly loaded video segment reaches a certain time.
Also, I want to be able to seamlessly loop a video segment, and then, on interaction, exit the loop and continue into another video segment.
How can I achieve these features?
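One common way to fire something at a given playback time, independent of MSE, is to watch the element's timeupdate event (it only fires a few times per second, so it is approximate). A minimal sketch, assuming a known targetTime inside the appended segment:
video.addEventListener("timeupdate", function onTime() {
  if (video.currentTime >= targetTime) { // targetTime is a placeholder value
    video.removeEventListener("timeupdate", onTime);
    // fire a custom event that the rest of the player can react to
    video.dispatchEvent(new CustomEvent("segment-threshold-reached"));
  }
});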

Related

Under what situation would muting an <audio> element also affect its srcObject's connected nodes?

I realize that the title may be a little confusing, so let me give some context.
I'm writing an extension to do some audio manipulation with Google Meet, and, after studying its behavior, I found a weird issue that I can't seem to wrap my head around.
Google Meet seems to use three <audio> elements to play audio, with each one having their own MediaStreams. Through some testing, it seems that:
Muting the <audio> element stops Google Meet's audio visualizations of who is talking.
Swapping the .srcObject properties of two audio elements and then calling .play() on them does not affect Google Meet's audio visualizations.
These seem to point to Google Meet connecting the source MediaStream into its audio processing graph to create the visualizations rather than capturing the <audio> element, since I can swap MediaStreams without affecting the visualizations.
However, one more thing that I noticed seems to make no sense given the information above:
Adding a new MediaStreamAudioSourceNode from the .srcObject of the <audio> element and connecting it to an AnalyserNode showed that, even when I mute the <audio> element, I can still analyse the audio being played through the MediaStream.
Here's some example code and outputs done through the browser console:
ac = new AudioContext();
an = ac.createAnalyser()
sn = ac.createMediaStreamSource(document.querySelectorAll("audio")[0].srcObject)
sn.connect(an)
function analyse(aNode) {
  const ret = new Float32Array(aNode.frequencyBinCount);
  aNode.getFloatTimeDomainData(ret);
  return ret;
}
analyse(an)
// > Float32Array(1024) [ 0.342987060546875, 0.36688232421875, 0.37115478515625, 0.362457275390625, 0.35150146484375, 0.3402099609375, 0.321075439453125, 0.308746337890625, 0.29779052734375, 0.272552490234375, … ]
document.querySelectorAll("audio")[0].muted = true
analyse(an)
// > Float32Array(1024) [ -0.203582763671875, -0.258026123046875, -0.31134033203125, -0.34375, -0.372802734375, -0.396484375, -0.3919677734375, -0.36328125, -0.31689453125, -0.247650146484375, … ]
// Here, I mute the microphone on *my end* through Google Meet.
analyse(an)
// > Float32Array(1024) [ -0.000030517578125, 0, 0, -0.000030517578125, -0.000091552734375, -0.000091552734375, -0.000091552734375, -0.00006103515625, 0, 0.000030517578125, … ]
// The values here are much closer to zero.
As you can see, when the audio element is muted, the AnalyserNode can still pick up on the audio, but Meet's visualizations break. That is what I don't understand. How can that be?
How can a connected AnalyserNode keep working when the <audio> element is muted, while whatever Meet uses breaks, without using .captureStream()?
Another weird thing is that it only happens on Chrome. On Firefox, Meet's visualizations work fine even if the audio element is muted. I think this might be related to a known Chrome issue where MediaStreams require a playing <audio> element to output anything to the audio graph (https://stackoverflow.com/a/55644983), but I can't see how that would affect a muted <audio> element.
It's a bit confusing, but the behavior of audioElement.captureStream() is actually different from using a MediaElementAudioSourceNode.
new MediaStreamAudioSourceNode(audioContext, { mediaStream: audioElement.captureStream() });
// is not equivalent to
new MediaElementAudioSourceNode(audioContext, { mediaElement: audioElement });
The stream obtained by calling audioElement.captureStream() is not affected by any volume changes on the audio element, and calling it does not change the volume of the audio element itself.
Using a MediaElementAudioSourceNode, however, re-routes the audio of the audio element into an AudioContext. That audio is affected by any volume changes made to the element, which means muting the audio element also mutes the audio that gets piped into the AudioContext.
On top of that, using a MediaElementAudioSourceNode makes the audio element itself silent.
I assume Google Meet uses a MediaElementAudioSourceNode for each audio element to process the audio.
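A minimal sketch of the two routing options described above; audioEl is assumed to be one of the page's <audio> elements with a MediaStream as its srcObject:
const ctx = new AudioContext();

// Option 1: MediaElementAudioSourceNode. The element's output is re-routed
// into the AudioContext, so audioEl.muted and audioEl.volume affect this path,
// and the element itself goes silent.
const elementSource = ctx.createMediaElementSource(audioEl);
elementSource.connect(ctx.destination);

// Option 2: MediaStreamAudioSourceNode built from the element's srcObject.
// This taps the underlying MediaStream directly, so muting the element has
// no effect on what reaches the graph.
const streamSource = ctx.createMediaStreamSource(audioEl.srcObject);
streamSource.connect(ctx.createAnalyser());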

How to show Video Stream Tracks from two different streams in one HTMLMediaElement (from camera and WebGL stream)

I need to show video from the user's camera and virtual objects created with WebGL on a web page, in a single html element, probably either <canvas> or <video>.
I can successfully get the user's video (it is a stream) from the navigator's media devices and show it in a <video> element.
I can successfully create virtual visual objects with WebGL and show them in a <canvas> element, by using someone else's example code (from MDN).
I need to mix them in a single html element. How can I achieve that?
My further research shows me that there is a captureStream() method on the HTMLMediaElement interface. Both <video> and <canvas> have this method. I can capture the stream from such an element and use it for something else, like attaching it to another html element (but probably not to a canvas element) or to a WebRTC peer connection as a source, or recording it. But this overwrites the previous stream.
Then I found that a stream, called a MediaStream, has tracks inside it, like video tracks, audio tracks, even text tracks. More can be added with the addTrack method of the MediaStream, and they can be retrieved with the getTracks method. I have added the video track from my <canvas> element's stream to the <video> element's stream; however, I can only see the original video track from the user media in the <video> element.
What am I missing to achieve that?
<html>
<body>
  getting video stream from camera into screen
  <video autoplay></video>
  getting virtual objects into screen
  <canvas id="glcanvas" width="640" height="480"></canvas>

  <!-- webgl code that draws a rotating colored cube on the canvas's WebGL context -->
  <script src="gl-matrix.js"></script>
  <script src="webgl-demo.js"></script>
  <script>
    // getting video stream from camera into screen
    const video = document.querySelector("video");
    navigator.mediaDevices.getUserMedia({ video: true })
      .then(stream => {
        let canv = document.querySelector("#glcanvas");
        let canvstrm = canv.captureStream();
        // get a track from the canvas stream and add it to the user media stream
        let canvstrmtrack = canvstrm.getTracks()[0];
        stream.addTrack(canvstrmtrack);
        video.srcObject = stream;
      });
  </script>
</body>
</html>
Complete gist
A video element can only play a single video track at a time.
Support for this is found in the MediaCapture spec:
Since the order in the MediaStream's track set is undefined, no requirements are put on how the AudioTrackList and VideoTrackList is ordered
And in HTML:
Either zero or one video track is selected; selecting a new track while a previous one is selected will unselect the previous one.
It sounds like you expected the graphics to overlay the video, with e.g. black as transparent.
There are no good video composition tools in the web platform at the moment. About the only option is to repeatedly draw frames from the video element into a canvas, and use canvas.captureStream, but it is resource intensive and slow.
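A rough sketch of that compositing approach (element ids and sizes follow the question's markup, and the WebGL canvas is assumed to clear to a transparent color):
const video = document.querySelector("video");
const glCanvas = document.querySelector("#glcanvas");
const composite = document.createElement("canvas");
composite.width = 640;
composite.height = 480;
const ctx = composite.getContext("2d");

function drawFrame() {
  ctx.drawImage(video, 0, 0, composite.width, composite.height); // camera frame
  ctx.drawImage(glCanvas, 0, 0);                                 // WebGL frame on top
  requestAnimationFrame(drawFrame);
}
requestAnimationFrame(drawFrame);

// a single stream whose video track contains the mixed result
const mixedStream = composite.captureStream();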
If you are merely doing this for playback (not recording or transmitting the result), then you might be able to achieve this effect much more cheaply by positioning a canvas with transparency on top of the video element using CSS. This approach also avoids cross-origin restrictions.
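For the playback-only case, the overlay might look something like this (again assuming the WebGL scene clears to a transparent color):
<div style="position: relative; width: 640px; height: 480px;">
  <video autoplay style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;"></video>
  <canvas id="glcanvas" width="640" height="480" style="position: absolute; top: 0; left: 0;"></canvas>
</div>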

Capture the html canvas for processing

I am working on a small project that requires capturing and analysing the content of the canvas.
It is an agent that plays Google's no-internet dinosaur game.
I can access content of the canvas from a console with:
canvas = document.getElementById("gamecanvas");
context = canvas.getContext("2d");
imgData = context.getImageData(0,0,600,150);
I have been trying to use HTMLCanvasElement.captureStream() to generate an event at a given framerate, or whenever the canvas changes.
But when I implement it as:
const canvas = document.getElementById("gamecanvas");
const stream = canvas.captureStream(25);
stream.onaddtrack = function(event) { console.log("Called"); };
I would expect the console.log("Called") to be called 25 times per second, but nothing gets called. Have I misunderstood something about the streams?
HTMLCanvasElement.captureStream returns a MediaStream. This MediaStream is initially composed of a special kind of MediaStreamTrack: a CanvasCaptureMediaStreamTrack which is simply a special video track with a link to the original HTMLCanvasElement.
This may still sound like a foreign language at this stage...
A MediaStream is a container object holding tracks, which themselves hold a stream of raw data that is part of a media resource. An audio track will hold a stream of raw audio data; a video or canvas track will hold a stream of raw video data.
Tracks can be added to or removed from a MediaStream, so that a MediaStream that was fed by a webcam's video can be changed to one whose video comes from WebRTC, etc. This is what the onaddtrack event monitors: when a MediaStreamTrack is added to the MediaStream container.
It has nothing to do with frames being appended to the video stream; for the MediaStream, it is either streaming or paused.
So your MediaStream holds a stream of video data, generated from the canvas's current state.
A stream captured from a canvas has one special property: you can request the maximum frequency at which the browser should append new frames to the video stream. However, this is just a maximum; if nothing new has been painted on the canvas, no image gets appended, and the stream will continue to display the last image that was appended.
I don't think there is any way to know when this operation happens, but even if there were, your process would be far too convoluted:
1. draw on canvas1
2. capture stream
3. render stream in <video>
4. draw <video> on canvas2
5. process the image drawn on canvas2
While all you need is:
1. draw on canvas1
2. process the image drawn on canvas1
If you want to do it at a certain frame-rate, then set up a timeout loop.
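A minimal sketch of that loop, reusing the getImageData snippet from the question and sampling roughly 25 times per second (processFrame is a placeholder for your own analysis code):
const canvas = document.getElementById("gamecanvas");
const context = canvas.getContext("2d");

setInterval(() => {
  const imgData = context.getImageData(0, 0, 600, 150);
  processFrame(imgData); // placeholder: analyse the pixels / drive the agent
}, 1000 / 25);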

Make an audio sprite without loading the complete song

I'm using Howler JS to play songs on a website. I want just a portion of the song to be played.
I'm making a sprite of each mp3, and those sprites can be played. However, it takes really long before the audio plays. It seems like the whole mp3 is downloaded first and then the sprite begins, which really hurts performance and wastes bandwidth.
I'm not familiar with Howler at all. Maybe there's a method to download just the portion to be played, or if not, is there any other library/way to accomplish this?
<div
  className="playExtrait"
  onClick={() => {
    Howler.unload();
    let song = new Howl({
      src: [url],
      html5: true,
      sprite: {
        extrait: [0, 30000]
      }
    });
    let songID = song.play("extrait");
    setPlayPause("playing");
    song.fade(1, 0, 30000, songID);
    song.on("end", () => {
      setPlayPause("paused");
    });
  }}
>
You can create recordings of specific time slices of the media by using a Media Fragments URI: for example, set the src of an <audio> element to /path/to/media#t=10,15 to play back seconds 10 through 15 of the media resource, then use MediaRecorder to record the playback and save the recording as a .webm media file, stopping the MediaRecorder at the pause event of the HTMLMediaElement.
See
How to edit (trim) a video in the browser?
How to get a precise timeupdate on a video to return upto 2 decimal numbers (milliseconds)?
Javascript - Seek audio to certain position when at exact position in audio track
How to use Blob URL, MediaSource or other methods to play concatenated Blobs of media fragments??
For an example of concatenating multiple media fragments into a single recording, see MediaFragmentRecorder (I am the author of the code at the repository). MediaSource at Chromium/Chrome has issues when MediaRecorder is used to record a MediaSource stream, though the code should still produce the expected result at Firefox.
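A rough sketch of that media-fragment + MediaRecorder idea (the URL and times are placeholders, HTMLMediaElement.captureStream() may be prefixed as mozCaptureStream in Firefox, and same-origin media is assumed):
const audio = new Audio("/path/to/media#t=10,15"); // play only seconds 10-15
const recorder = new MediaRecorder(audio.captureStream());
const chunks = [];
recorder.ondataavailable = e => chunks.push(e.data);
recorder.onstop = () => {
  const recording = new Blob(chunks, { type: "audio/webm" });
  // e.g. URL.createObjectURL(recording) to play or download the clip
};
// the element fires "pause" when the fragment's end time is reached
audio.addEventListener("pause", () => recorder.stop(), { once: true });
audio.addEventListener("canplay", () => {
  recorder.start();
  audio.play();
}, { once: true });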

Load multiple videos without playing

I'm building a video player where each scene is filmed from multiple angles. All videos are hosted on YouTube. I'd like to allow the user to be able to switch between angles seamlessly during playback.
To facilitate this, I need a way to load videos from YouTube without playing them. That way I can load alternate angles in the background while one angle is playing. When the user switches angle, the new angle should be at least partially loaded and ready to play immediately.
Unfortunately, I can't find a way to load a video without playing it.
The loadVideoById method autoplays the video as soon as the request to load the video has returned, so that won't work.
Is this possible?
There's no way to cue up a video and force it to pre-buffer.
You can load (as opposed to cue) a video and then immediately pause it, and it may or may not pre-buffer, but that's dependent on a number of factors and is outside your control as someone using the API.
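If you try the load-then-pause route anyway, a sketch with the YouTube IFrame Player API might look like this (VIDEO_ID and the player container are placeholders, and whether the paused player keeps buffering remains out of your control):
// assumes the IFrame API script is already loaded and a <div id="player"> exists
const player = new YT.Player("player", {
  videoId: "VIDEO_ID", // placeholder
  events: {
    onStateChange: (event) => {
      // pause as soon as playback actually starts; buffering may or may not continue
      if (event.data === YT.PlayerState.PLAYING) {
        event.target.pauseVideo();
      }
    }
  }
});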
