Make an audio sprite without loading the complete song - javascript

I'm using Howler JS to play songs on a website. I want just a portion of the song to be played.
I'm making a sprite of each mp3 so that just those sprites can be played. However, it takes really long before the audio plays. It's as if the whole mp3 is downloaded first and only then the sprite begins, which really degrades performance and consumes bandwidth.
I'm not very familiar with Howler; maybe there's a way to download just the portion to be played, or if not, is there another library or approach to accomplish this?
<div
  className="playExtrait"
  onClick={() => {
    Howler.unload();
    let song = new Howl({
      src: [url],
      html5: true,
      sprite: {
        extrait: [0, 30000]
      }
    });
    let songID = song.play("extrait");
    setPlayPause("playing");
    song.fade(1, 0, 30000, songID);
    song.on("end", () => {
      setPlayPause("paused");
    });
  }}
>

You can create recordings of specific time slices of the media by using Media Fragments URI. For example, set the src of an <audio> element to /path/to/media#t=10,15 to play back seconds 10 through 15 of the media resource, and use MediaRecorder to record the playback and save the recording as a .webm media file, stopping the MediaRecorder at the pause event of the HTMLMediaElement.
See
How to edit (trim) a video in the browser?
How to get a precise timeupdate on a video to return upto 2 decimal numbers (milliseconds)?
Javascript - Seek audio to certain position when at exact position in audio track
How to use Blob URL, MediaSource or other methods to play concatenated Blobs of media fragments?
For an example of concatenating multiple media fragments into a single recording, see MediaFragmentRecorder (I am the author of the code at that repository). MediaSource in Chromium/Chrome has issues when MediaRecorder is used to record a MediaSource stream, though the code should still produce the expected result in Firefox.
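A minimal sketch of that record-a-fragment approach, assuming a placeholder file path and that HTMLMediaElement.captureStream() is available (Firefox uses the prefixed mozCaptureStream()), could look like this:
// Play only seconds 10-15 via a Media Fragments URI, record that playback
// with MediaRecorder, and stop recording when the fragment ends (pause event).
const audio = new Audio("/path/to/media#t=10,15"); // placeholder path
audio.addEventListener("canplay", () => {
  const stream = audio.captureStream
    ? audio.captureStream()
    : audio.mozCaptureStream(); // Firefox prefix
  const recorder = new MediaRecorder(stream);
  const chunks = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: "audio/webm" });
    // e.g. URL.createObjectURL(blob) to play or download the recorded slice
  };
  // The media fragment makes the element pause at t=15, which ends the recording.
  audio.addEventListener("pause", () => recorder.stop(), { once: true });
  recorder.start();
  audio.play(); // may require a prior user gesture due to autoplay policies
}, { once: true });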

Related

Under what situation would muting an <audio> element also affect its srcObject's connected nodes?

I realize that the title may be a little confusing, so let me give some context.
I'm writing an extension to do some audio manipulation with Google Meet, and, after studying its behavior, I found a weird issue that I can't seem to wrap my head around.
Google Meet seems to use three <audio> elements to play audio, each with its own MediaStream. Through some testing, it seems that:
Muting the <audio> element stops Google Meet's audio visualizations of who is talking.
Swapping the .srcObject properties of two audio elements and then calling .play() on them does not affect Google Meet's audio visualizations.
These observations seem to point to Google Meet connecting the source MediaStream into its audio processing graph to create the visualizations, rather than capturing the <audio> element, since I can swap MediaStreams without affecting the visualizations.
However, one more thing I noticed seems to make no sense given the above:
Adding a new MediaStreamAudioSourceNode from the .srcObject of the <audio> element and connecting it to an AnalyserNode showed that, even when I mute the <audio> element, I can still analyse the audio being played through the MediaStream.
Here's some example code and outputs done through the browser console:
ac = new AudioContext();
an = ac.createAnalyser();
sn = ac.createMediaStreamSource(document.querySelectorAll("audio")[0].srcObject);
sn.connect(an);
function analyse(aNode) {
  const ret = new Float32Array(aNode.frequencyBinCount);
  aNode.getFloatTimeDomainData(ret);
  return ret;
}
analyse(an)
// > Float32Array(1024) [ 0.342987060546875, 0.36688232421875, 0.37115478515625, 0.362457275390625, 0.35150146484375, 0.3402099609375, 0.321075439453125, 0.308746337890625, 0.29779052734375, 0.272552490234375, … ]
document.querySelectorAll("audio")[0].muted = true
analyse(an)
// > Float32Array(1024) [ -0.203582763671875, -0.258026123046875, -0.31134033203125, -0.34375, -0.372802734375, -0.396484375, -0.3919677734375, -0.36328125, -0.31689453125, -0.247650146484375, … ]
// Here, I mute the microphone on *my end* through Google Meet.
analyse(an)
// > Float32Array(1024) [ -0.000030517578125, 0, 0, -0.000030517578125, -0.000091552734375, -0.000091552734375, -0.000091552734375, -0.00006103515625, 0, 0.000030517578125, … ]
// The values here are much closer to zero.
As you can see, when the audio element is muted, the AnalyserNode can still pick up on the audio, but Meet's visualizations break. That is what I don't understand. How can that be?
How can a connected AnalyserNode not break when the <audio> element is muted, while something else (whatever Meet uses) does break, without using .captureStream()?
Another weird thing is that it only happens on Chrome. On Firefox, Meet's visualizations work fine even if the audio element is muted. I think this might be related to a known Chrome issue where MediaStreams require a playing <audio> element to output anything to the audio graph (https://stackoverflow.com/a/55644983), but I can't see how that would affect a muted <audio> element.
It's a bit confusing, but the behavior of audioElement.captureStream() is actually different from using a MediaElementAudioSourceNode.
new MediaStreamAudioSourceNode(audioContext, { mediaStream: audioElement.captureStream() });
// is not equal to
new MediaElementAudioSourceNode(audioContext, { mediaElement: audioElement });
The stream obtained by calling audioElement.captureStream() is not affected by any volume changes on the audio element. Calling audioElement.captureStream() will also not change the volume of the audio element itself.
However, using a MediaElementAudioSourceNode will re-route the audio of the audio element into an AudioContext. That audio will be affected by any volume changes made to the audio element, which means muting the audio element will also mute the audio that gets piped into the AudioContext.
On top of that using a MediaElementAudioSourceNode will make the audio element itself silent.
I assume Google Meet uses a MediaElementAudioSourceNode for each audio element to process the audio.
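To illustrate the difference described above, here is a rough sketch (not Google Meet's actual code); audioEl is an assumed, already-playing <audio> element, and only one of the two variants should be used per element, since only one MediaElementAudioSourceNode may be created for a given element.
const ctx = new AudioContext();
const analyser = ctx.createAnalyser();

// Variant A: captureStream() - the captured stream ignores the element's
// mute/volume state, and the element keeps playing audibly on its own.
const captured = audioEl.captureStream();
new MediaStreamAudioSourceNode(ctx, { mediaStream: captured }).connect(analyser);

// Variant B: MediaElementAudioSourceNode - the element's output is re-routed
// into the AudioContext, so muting the element mutes what reaches the graph,
// and the element itself goes silent unless routed back to the destination.
const elementSource = new MediaElementAudioSourceNode(ctx, { mediaElement: audioEl });
elementSource.connect(analyser);
elementSource.connect(ctx.destination); // route audio back out so it stays audible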

HTML5 Canvas Recording Crashing onstop new Blob recordedChunks

I am recording an HTML5 canvas stream using MediaRecorder. The stream passed to MediaRecorder is a mixed stream: a mixture of (1) canvas.captureStream(30) and (2) an audio stream (since the canvas animation has audio playing on the page).
When the recording is longer than, say, 1 minute, the Chrome tab crashes after the line:
var blob = new Blob(recordedChunks, { 'type' : 'video/mp4' });
When it is shorter, say 10 seconds, the crash does not occur.
The resulting video has large dimensions; I'm not sure if that is the issue. My canvas animation is mostly images and mp4 clips played in a sequence (think of it as a slide show).
How can I fix this crash? Even when there is no crash, it takes a long time for the new Blob to complete before I get the final video.
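For reference, a minimal sketch of the setup described in the question; canvas and audioElement are assumed to exist, and the mimeType is an assumption (MediaRecorder in Chrome typically produces WebM rather than MP4):
// Mix the canvas video track with an audio track and record the result.
const canvasStream = canvas.captureStream(30);
const audioStream = audioElement.captureStream();
const mixedStream = new MediaStream([
  ...canvasStream.getVideoTracks(),
  ...audioStream.getAudioTracks(),
]);

const recordedChunks = [];
const recorder = new MediaRecorder(mixedStream, {
  mimeType: "video/webm;codecs=vp9,opus",
});
recorder.ondataavailable = (e) => {
  if (e.data.size > 0) recordedChunks.push(e.data);
};
recorder.onstop = () => {
  const blob = new Blob(recordedChunks, { type: "video/webm" });
  // ...use the blob, e.g. URL.createObjectURL(blob) for download
};
// Passing a timeslice makes dataavailable fire periodically instead of
// delivering one huge chunk at stop().
recorder.start(1000);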

Interactive video using Media Source Extension API

Background
I am trying to create an interactive video player using the HTML5 video element.
I came across Media Source Extensions API, and got it working.
var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);
sourceBuffer = mediaSource.addSourceBuffer(mime);
Next I make a REST call to fetch the video segment and attach it to the source buffer.
sourceBuffer.appendBuffer(arrayBuffer);
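Putting those pieces together, a fuller sketch of this setup might look like the following; the MIME/codec string and the segment URL are assumptions, and note that addSourceBuffer() has to wait for the MediaSource's sourceopen event:
const video = document.querySelector("video");
const mime = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'; // assumed codec string

const mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", async () => {
  const sourceBuffer = mediaSource.addSourceBuffer(mime);

  // REST call for a video segment, then append it to the source buffer.
  const response = await fetch("/api/segments/0"); // placeholder endpoint
  const arrayBuffer = await response.arrayBuffer();
  sourceBuffer.appendBuffer(arrayBuffer);
});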
Problem
I am trying to fire events when the newly loaded video segment reaches a certain time.
Also, I want to be able to seamlessly loop a video segment and, on interaction, exit the loop and continue to another video segment.
How can I achieve these features?

Playing Live Audio Stream in HTML5 - MediaSource Errors in Chrome

I need a way to play a live audio stream using HTML5 (primarily in Google Chrome), so I tried using the following:
<audio>
<source src="my-live-stream.ogg" type="audio/ogg">
</audio>
While this does work for a live stream, there seems to be a very large, uncontrollable delay/buffer of around 30 seconds or so when this is played.
I need the delay to be a couple of seconds or less so this method doesn't work.
As an alternative I have tried sending the audio over a WebSocket connection as individual 1-second audio files, which are then appended to a SourceBuffer and played in an audio element using Media Source Extensions.
After experimenting with a number of formats (MediaSource.isTypeSupported seems to be rather limited in audio support), I got this working using a Vorbis audio stream in a WebM container, which sounds perfect with no audible gaps. Other formats worked less well, since the segments need to play back gaplessly - e.g. MP3 and AAC end up with tiny audible gaps between each 1-second segment.
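For reference, a minimal sketch of this WebSocket + SourceBuffer approach (the WebSocket URL is an assumption and error handling is omitted; appends are queued because appendBuffer() throws while the SourceBuffer is still updating):
const audio = document.querySelector("audio");
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", () => {
  const sourceBuffer = mediaSource.addSourceBuffer('audio/webm; codecs="vorbis"');
  const queue = [];

  // Append the next queued segment once the previous append has finished.
  sourceBuffer.addEventListener("updateend", () => {
    if (queue.length && !sourceBuffer.updating) {
      sourceBuffer.appendBuffer(queue.shift());
    }
  });

  const ws = new WebSocket("wss://example.com/audio-stream"); // placeholder URL
  ws.binaryType = "arraybuffer";
  ws.onmessage = (event) => {
    if (sourceBuffer.updating || queue.length) {
      queue.push(event.data);
    } else {
      sourceBuffer.appendBuffer(event.data);
    }
  };

  audio.play(); // may require a prior user gesture due to autoplay policies
});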
While this seems to work at first, when looking at chrome://media-internals, the following errors repeatedly appear:
00:00:09 544 info Estimating WebM block duration to be 3ms for the last (Simple)Block in the Cluster for this Track. Use BlockGroups with BlockDurations at the end of each Track in a Cluster to avoid estimation.
00:00:09 585 error Large timestamp gap detected; may cause AV sync to drift. time:8994999us expected:9231000us delta:-236001us
00:01:05 239 debug Skipping splice frame generation: not enough samples for splicing new buffer at 65077997us. Have 1us, but need 1000us.
Eventually the playback stops as though the pause button has been pressed on the audio element. It still shows the pause rather than the play button, but the current time stops advancing.
Pressing the pause button and then the play button that replaces it doesn't make it start playing again, but manually dragging the position slider further ahead makes it continue playing.
I have tried setting sourceBuffer.mode = 'sequence'; but this doesn't seem to help.
Is there anything that needs to be changed in how the audio files are being encoded, or how they are played back in JavaScript to fix this?
Additional details:
The audio stream is encoded into 1 second WebM/Vorbis files using FFmpeg on Windows.
A background worker is used in the browser to receive the audio segments and pass them to the main page thread, which appends them to the audio stream. Otherwise the playback freezes.
Source code:
Web player: https://github.com/SamuelFisher/WebSocketAudio
WebSocket server and encoder: https://github.com/SamuelFisher/WebSocketAudioServer

Multiple audio tracks for HTML5 video

I'm building a video for my website with HTML5. Ideally, I'd have only one silent video file, and five different audio tracks in different languages that sync up with the video.
Then I'd have a button that allows users to switch between audio tracks, even as the video is playing; and the correct audio track would come to life (without the video pausing or starting over or anything; much like a DVD audio track selection).
I can do this quite simply in Flash, but I don't want to. There has to be a way to do this in pure HTML5 or HTML5+jQuery. I'm thinking you'd play all the audio files at 0 volume, and only increase the volume of the active track... but I don't know how to even do that, let alone handle it when the user pauses or rewinds the video...
Thanks in advance!
Synchronization between audio and video is far more complex than simply starting the audio and video at the same time. Sound cards will play back at slightly different rates. (What is 44.1 kHz to me might actually be 44.095 kHz to you.)
Often, the video is synchronized to the audio stream, but the player is what handles this. By loading up multiple objects for playback, you are effectively pulling them out of sync.
The only way this is going to work is if you can find a way to control the different audio streams from the video player. I don't know if this is possible.
Instead, I propose that you encode the video multiple times, each with a different audio stream. You can use FFmpeg for this, and even automate the process, depending on your workflow. Switching among streams becomes a problem, but most video players are robust enough to guess the byte offset in the file when given the bitrate.
If you only needed two languages, you could simply adjust the balance between a left and right stereo audio channel.
If you're willing to let all five tracks download, why not just mux them into the video? Videos are not limited to a single audio track (even AVI could do multiple audio tracks). Then syncing should be handled for you. You'd just enable/disable the audio tracks as needed.
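Where the browser exposes HTMLMediaElement.audioTracks (support varies and may sit behind a flag), switching between muxed tracks could look roughly like this hedged sketch; "video" is an assumed element whose file was muxed with one audio track per language:
const video = document.querySelector("video");

function selectAudioLanguage(lang) {
  for (let i = 0; i < video.audioTracks.length; i++) {
    const track = video.audioTracks[i];
    track.enabled = track.language === lang; // enable only the chosen language
  }
}

selectAudioLanguage("fr"); // switch tracks without pausing or restarting the video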
It is doable with the Web Audio API.
Part of my program listens to video element events and stops or restarts audio tracks created using the Web Audio API. This gives me the ability to turn any of the tracks on and off in perfect sync.
There are some drawbacks.
There is no Web Audio API support in Internet Explorer (Edge does support it).
The technique works with buffered audio only, which is limiting. There are also some problems with large files: https://bugs.chromium.org/p/chromium/issues/detail?id=71704
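A rough sketch of this event-driven approach, assuming "video" is the <video> element and "activeBuffer" is an already decoded AudioBuffer for the selected language (gain and track-switching details are omitted):
const ctx = new AudioContext();
let sourceNode = null;

function startTrack(buffer, offsetSeconds) {
  // AudioBufferSourceNodes are single-use, so create a fresh one each time.
  sourceNode = ctx.createBufferSource();
  sourceNode.buffer = buffer;
  sourceNode.connect(ctx.destination);
  sourceNode.start(0, offsetSeconds); // start in sync with the video position
}

function stopTrack() {
  if (sourceNode) {
    sourceNode.stop();
    sourceNode = null;
  }
}

video.addEventListener("play", () => startTrack(activeBuffer, video.currentTime));
video.addEventListener("pause", stopTrack);
video.addEventListener("seeking", stopTrack);
video.addEventListener("seeked", () => {
  if (!video.paused) startTrack(activeBuffer, video.currentTime);
});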
