Playing Live Audio Stream in HTML5 - MediaSource Errors in Chrome

I need a way to play a live audio stream using HTML5 (primarily in Google Chrome), so I tried using the following:
<audio>
<source src="my-live-stream.ogg" type="audio/ogg">
</audio>
While this does work for a live stream, there seems to be a very large, uncontrollable delay/buffer of around 30 seconds or so when this is played.
I need the delay to be a couple of seconds or less, so this method doesn't work.
As an alternative, I have tried sending the audio over a WebSocket connection as individual 1-second audio files, which are then appended to a SourceBuffer and played in an audio element using Media Source Extensions.
After experimenting with a number of formats (MediaSource.isTypeSupported seems to be rather limited in audio support), I got this working using a Vorbis audio stream in a WebM container, which sounds perfect with no audible gaps. Other formats worked less well because the segments need to join gaplessly: MP3 and AAC, for example, end up with tiny audible gaps between each 1-second segment.
While this seems to work at first, when looking at chrome://media-internals, the following errors repeatedly appear:
00:00:09 544 info Estimating WebM block duration to be 3ms for the last (Simple)Block in the Cluster for this Track. Use BlockGroups with BlockDurations at the end of each Track in a Cluster to avoid estimation.
00:00:09 585 error Large timestamp gap detected; may cause AV sync to drift. time:8994999us expected:9231000us delta:-236001us
00:01:05 239 debug Skipping splice frame generation: not enough samples for splicing new buffer at 65077997us. Have 1us, but need 1000us.
Eventually the playback stops as though the pause button has been pressed on the audio element. The control still shows the pause button rather than the play button, but the current time stops advancing.
Pressing the pause button and then the play button that replaces it doesn't make it start playing again, but manually dragging the position slider further ahead makes it continue playing.
I have tried setting sourceBuffer.mode = 'sequence'; but this doesn't seem to help.
Is there anything that needs to be changed in how the audio files are being encoded, or how they are played back in JavaScript to fix this?
Additional details:
The audio stream is encoded into 1-second WebM/Vorbis files using FFmpeg on Windows.
A background worker is used in the browser to receive the audio segments and pass them to the main page thread, which appends them to the audio stream. Otherwise the playback freezes.
Source code:
Web player: https://github.com/SamuelFisher/WebSocketAudio
WebSocket server and encoder: https://github.com/SamuelFisher/WebSocketAudioServer
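For reference, the append path described above (segments arrive from the worker and are appended to the SourceBuffer) could be sketched roughly as follows. This is a minimal sketch, not the code from the linked repositories, and it is not claimed to fix the errors shown in chrome://media-internals. SegmentQueue is a hypothetical name, and the sourceBuffer argument is assumed to behave like an MSE SourceBuffer (an updating flag, appendBuffer(), and an "updateend" event). Queuing matters because appendBuffer() throws if it is called while a previous append is still in progress.

```javascript
// Hypothetical helper: queues segments and appends them one at a time.
// `sourceBuffer` is assumed to behave like an MSE SourceBuffer
// (an `updating` flag, `appendBuffer()`, and an "updateend" event).
class SegmentQueue {
  constructor(sourceBuffer) {
    this.sourceBuffer = sourceBuffer;
    this.pending = [];
    // When one append finishes, try the next queued segment.
    sourceBuffer.addEventListener("updateend", () => this.flush());
  }

  // Call this for each segment received (e.g. from the WebSocket worker).
  push(segment) {
    this.pending.push(segment);
    this.flush();
  }

  // Append the oldest queued segment unless an append is still in progress.
  flush() {
    if (this.sourceBuffer.updating || this.pending.length === 0) return;
    this.sourceBuffer.appendBuffer(this.pending.shift());
  }
}
```

In a page, the queue would wrap the SourceBuffer created for the WebM/Vorbis stream, and the worker's message handler would call push() with each received ArrayBuffer.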

Related

How to record audio and play music on webpage on mobile phone with javascript?

I have a web app that records sound and plays music from an embedded YouTube player (YouTube Player API). For the audio, I record continuously by restarting the recorder each time it ends, in an infinite loop (speech recognition class). But I also want to play the YouTube video at the same time. On a desktop, the two don't interfere with each other (or at least not with a headphone and mic). But on my Android phone, every time it starts recording, it pauses the video and then resumes it when the recording ends. Is there a way to make the video not pause?

You would have to break the stream into separate threads, and hope that you have enough power to handle them all.
I say "all" because while your music stream may only be carried on one thread, the video from YouTube is still there.
In effect, you have to pretend to be the originator and produce all of the A/V signals separately, then allow the mobile device to consume them (assuming that the signals are properly encoded to begin with).

Make an audio sprite without loading the complete song

I'm using Howler JS to play songs on a website. I want just a portion of the song to be played.
I'm making a sprite of each MP3, and those sprites can be played. However, it takes a really long time before the audio plays. It's as if the whole MP3 is downloaded first and only then does the sprite begin, which hurts performance and wastes bandwidth.
I'm not familiar with Howler at all. Maybe there's a method to download just the portion to be played, or if not, is there any other library or way to accomplish this?
<div
  className="playExtrait"
  onClick={() => {
    Howler.unload();
    let song = new Howl({
      src: [url],
      html5: true,
      sprite: {
        extrait: [0, 30000]
      }
    });
    let songID = song.play("extrait");
    setPlayPause("playing");
    song.fade(1, 0, 30000, songID);
    song.on("end", () => {
      setPlayPause("paused");
    });
  }}
>

You can create a recording of each specific time slice of the media by using a Media Fragments URI. For example, set the src of an <audio> element to /path/to/media#t=10,15 to play back seconds 10 through 15 of the media resource, use MediaRecorder to record the playback, and save the recording as a .webm media file. MediaRecorder is stopped at the pause event of the HTMLMediaElement.
See
How to edit (trim) a video in the browser?
How to get a precise timeupdate on a video to return upto 2 decimal numbers (milliseconds)?
Javascript - Seek audio to certain position when at exact position in audio track
How to use Blob URL, MediaSource or other methods to play concatenated Blobs of media fragments??
For an example of concatenating multiple media fragments into a single recording, see MediaFragmentRecorder (I am the author of the code in that repository). MediaSource in Chromium/Chrome has issues when MediaRecorder is used to record a MediaSource stream, though the code should still produce the expected result in Firefox.
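As a small illustration of the Media Fragments URI form used in this answer (#t=start,end, in seconds), a helper along these lines could build the src value. The function name mediaFragmentUrl is hypothetical, and this is only a sketch: whether the browser fetches just that portion of the file or the whole resource depends on the browser and the server, so it should not be read as a guaranteed bandwidth saving.

```javascript
// Hypothetical helper: builds a Media Fragments URI for a time slice.
// Times are in seconds, matching the "#t=10,15" form.
function mediaFragmentUrl(src, startSeconds, endSeconds) {
  if (!(startSeconds >= 0) || !(endSeconds > startSeconds)) {
    throw new RangeError("expected 0 <= start < end");
  }
  // Drop any existing fragment before adding the temporal one.
  const base = src.split("#")[0];
  return `${base}#t=${startSeconds},${endSeconds}`;
}
```

For the 30-second extract in the question, the result of mediaFragmentUrl(url, 0, 30) would be assigned to the src of an <audio> element.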

How to keep a live MediaSource video stream in-sync?

I have a server application which renders a 30 FPS video stream then encodes and muxes it in real-time into a WebM Byte Stream.
On the client side, an HTML5 page opens a WebSocket to the server, which starts generating the stream when connection is accepted. After the header is delivered, each subsequent WebSocket frame consists of a single WebM SimpleBlock. A keyframe occurs every 15 frames and when this happens a new Cluster is started.
The client also creates a MediaSource, and on receiving a frame from the WS, appends the content to its active buffer. The <video> starts playback immediately after the first frame is appended.
Everything works reasonably well. My only issue is that the network jitter causes the playback position to drift from the actual time after a while. My current solution is to hook into the updateend event, check the difference between the video.currentTime and the timecode on the incoming Cluster and manually update the currentTime if it falls outside an acceptable range. Unfortunately, this causes a noticeable pause and jump in the playback which is rather unpleasant.
The solution also feels a bit odd: I know exactly where the latest keyframe is, yet I have to convert it into a whole second (as per the W3C spec) before I can pass it into currentTime, where the browser presumably has to then go around and find the nearest keyframe.
My question is this: is there a way to tell the Media Element to always seek to the latest keyframe available, or keep the playback time synchronised with the system clock time?

"network jitter causes the playback position to drift"
That's not your problem. If you are experiencing drop-outs in the stream, you aren't buffering enough before playback to begin with, and playback just has an appropriately sized buffer, even if a few seconds behind realtime (which is normal).
"My current solution is to hook into the updateend event, check the difference between the video.currentTime and the timecode on the incoming Cluster"
That's close to the correct method. I suggest you ignore the timecode of incoming cluster and instead inspect your buffered time ranges. What you've received on the WebM cluster, and what's been decoded are two different things.
"Unfortunately, this causes a noticeable pause and jump in the playback which is rather unpleasant."
How else would you do it? You can either jump to realtime, or you can increase playback speed to catch up to realtime. Either way, if you want to catch up to realtime, you have to skip in time to do that.
"The solution also feels a bit odd: I know exactly where the latest keyframe is"
You may, but the player doesn't until that media is decoded. In any case, keyframe is irrelevant... you can seek to non-keyframe locations. The browser will decode ahead of P/B-frames as required.
"I have to convert it into a whole second (as per the W3C spec) before I can pass it into currentTime"
That's totally false. The currentTime is specified as a double. https://www.w3.org/TR/2011/WD-html5-20110113/video.html#dom-media-currenttime
"My question is this: is there a way to tell the Media Element to always seek to the latest keyframe available, or keep the playback time synchronised with the system clock time?"
It's going to play the last buffer automatically. You don't need to do anything. You're doing your job by ensuring media data lands in the buffer and setting playback as close to that as reasonable. You can always advance it forward if a network condition changes that allows you to do this, but frankly it sounds as if you just have broken code and a broken buffering strategy. Otherwise, playback would be simply smooth.
Catching up if fallen behind is not going to happen automatically, and nor should it. If the player pauses due to the buffer being drained, a buffer needs to be built back up again before playback can resume. That's the whole point of the buffer.
Furthermore, your expectation of keeping anything in-time with the system clock is not a good idea and is unreasonable. Different devices have different refresh rates, will handle video at different rates. Just hit play and let it play. If you end up being several seconds off, go ahead and set currentTime, but be very confident of what you've buffered before doing so.
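To make the suggestions in this answer concrete (inspect the buffered time ranges rather than the cluster timecode, then either jump or speed up), here is a minimal sketch. The function names and the threshold values are assumptions chosen for illustration, not recommended settings. The buffered argument is assumed to look like the TimeRanges object returned by video.buffered.

```javascript
// `buffered` is assumed to look like a TimeRanges object:
// a `length` plus `start(i)` and `end(i)` methods, in seconds.
// Returns how far playback is behind the end of the last buffered range.
function liveEdgeLag(buffered, currentTime) {
  if (buffered.length === 0) return 0;
  const liveEdge = buffered.end(buffered.length - 1);
  return Math.max(0, liveEdge - currentTime);
}

// Thresholds are illustrative assumptions, not recommended values.
function catchUpAction(lagSeconds, { speedUpAbove = 1.5, jumpAbove = 5 } = {}) {
  if (lagSeconds > jumpAbove) return "jump";         // set currentTime near the live edge
  if (lagSeconds > speedUpAbove) return "speed-up";  // raise playbackRate slightly
  return "none";                                     // close enough, leave playback alone
}
```

In an updateend handler this might be called as catchUpAction(liveEdgeLag(video.buffered, video.currentTime)), with the caller adjusting video.playbackRate or video.currentTime based on the result.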

How to fully buffer chrome video?

While creating a video player using the HTML5 video tag, I have noticed undesirable behavior in Google Chrome. When I pause the video, buffering starts, and when I play, buffering stops. The result is a poor user experience.
I'm using large video files, about 2-4 GB in size. Often, when I seek to some position and monitor the buffered ranges, I notice that Chrome buffers the wrong range. If I pause the player, Chrome continues to buffer that wrong range and never buffers the range that contains currentTime, which is the one the player is monitoring.
As a workaround, I play the video in the background, hidden beneath other DOM elements so the viewer doesn't notice, to force Chrome to buffer during playback. Once enough has been played and buffered, I seek the video back a few seconds. As soon as I do this, Chrome stops buffering, my buffered range is quickly played through, and the process starts over, which again makes for a bad user experience.
Is this a known issue, or am I doing something wrong? Is there any workaround to make Google Chrome buffering continue and not to stop?
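The monitoring described in the question (checking whether the range Chrome is buffering actually contains currentTime) could be sketched like this. The helper name is hypothetical, and the buffered argument is assumed to look like the TimeRanges object exposed as video.buffered. This only reports what has been buffered; it does not change how Chrome decides what to buffer.

```javascript
// Returns the buffered range that contains `time`, or null if `time`
// falls outside every buffered range.
// `buffered` is assumed to look like a TimeRanges object.
function bufferedRangeContaining(buffered, time) {
  for (let i = 0; i < buffered.length; i++) {
    if (time >= buffered.start(i) && time <= buffered.end(i)) {
      return { start: buffered.start(i), end: buffered.end(i) };
    }
  }
  return null;
}
```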

Multiple audio tracks for HTML5 video

I'm building a video for my website with HTML5. Ideally, I'd have only one silent video file, and five different audio tracks in different languages that sync up with the video.
Then I'd have a button that allows users to switch between audio tracks, even as the video is playing; and the correct audio track would come to life (without the video pausing or starting over or anything; much like a DVD audio track selection).
I can do this quite simply in Flash, but I don't want to. There has to be a way to do this in pure HTML5 or HTML5+jQuery. I'm thinking you'd play all the audio files at 0 volume, and only increase the volume of the active track... but I don't know how to even do that, let alone handle it when the user pauses or rewinds the video...
Thanks in advance!

Synchronization between audio and video is far more complex than simply starting the audio and video at the same time. Sound cards will play back at slightly different rates. (What is 44.1 kHz to me might actually be 44.095 kHz to you.)
Often, the video is synchronized to the audio stream, but the player is what handles this. By loading up multiple objects for playback, you are effectively pulling them out of sync.
The only way this is going to work is if you can find a way to control the different audio streams from the video player. I don't know if this is possible.
Instead, I propose that you encode the video multiple times, once with each audio stream. You can use FFmpeg for this, and even automate the process, depending on your workflow. Switching among streams becomes a problem, but most video players are robust enough to guess the byte offset in the file when given the bitrate.
If you only needed two languages, you could simply adjust the balance between a left and right stereo audio channel.
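To put a number on the sample-rate example at the start of this answer: a device that actually plays at 44.095 kHz instead of the nominal 44.1 kHz falls behind by 5 samples in every 44100, which is roughly 0.41 seconds over an hour. The helper below is just that arithmetic; its name is made up for illustration.

```javascript
// Seconds of media a device falls behind (positive) or runs ahead (negative)
// after `elapsedSeconds` of real time, when it plays at `actualHz`
// audio that was sampled at `nominalHz`.
function driftSeconds(nominalHz, actualHz, elapsedSeconds) {
  return elapsedSeconds * (nominalHz - actualHz) / nominalHz;
}

// One hour at 44.095 kHz instead of 44.1 kHz: about 0.41 s behind.
const oneHourDrift = driftSeconds(44100, 44095, 3600);
```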

If you're willing to let all five tracks download, why not just mux them into the video? Videos are not limited to a single audio track (even AVI could do multiple audio tracks). Then syncing should be handled for you. You'd just enable/disable the audio tracks as needed.
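If the tracks are muxed into the video as suggested here, switching could look roughly like the sketch below. It assumes the browser exposes the muxed tracks through the media element's audioTracks list, where each track has a language and a writable enabled flag; support for audioTracks is limited in some browsers, so treat this as an assumption to verify. The function name selectAudioTrack is hypothetical.

```javascript
// Enables the first track whose language matches and disables the others.
// `audioTracks` is assumed to be array-like, with items exposing
// `language` and a writable `enabled` flag (as AudioTrack does).
// Returns false and changes nothing if no track matches.
function selectAudioTrack(audioTracks, language) {
  const tracks = Array.from(audioTracks);
  const target = tracks.find((track) => track.language === language);
  if (!target) return false; // leave the current selection untouched
  for (const track of tracks) {
    track.enabled = track === target;
  }
  return true;
}
```

A language button's click handler might call selectAudioTrack(video.audioTracks, "fr") while the video keeps playing.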

It is doable with Web Audio API.
Part of my program listens to video element events and stops or restarts audio tracks created using web audio API. This gives me an ability to turn on and off any of the tracks in perfect sync.
There are some drawbacks.
There is no Web Audio API support in Internet Explorer; among Microsoft's browsers, only Edge supports it.
The technique works with buffered audio only and that's limiting. There are some problems with large files: https://bugs.chromium.org/p/chromium/issues/detail?id=71704
