MediaRecorder ignoring VideoFrame.timestamp - javascript
I would like to generate a video. I am using MediaRecorder to record a track generated by MediaStreamTrackGenerator.
Generating each frame takes some time, let's say 1 second, and I would like to generate the video at 10 fps.
Therefore, when I create a frame, I use timestamp and duration to indicate the real time of the frame.
const ms = 1_000_000; // 1,000,000 µs = 1 second
const fps = 10;
const frame = new VideoFrame(await createImageBitmap(canvas), {
  timestamp: (ms * 1) / fps,
  duration: ms / fps,
});
Unfortunately, if generating each frame takes 1 second then, despite the timestamp and duration I set, the video plays at 1 frame per second, not 10 fps.
How can I encode the video frames at the desired frame rate?
Bonus: When I download the generated video and open it in VLC, it has no duration. Can this be set?
CodePen for reproduction: https://codepen.io/AmitMY/pen/OJxgPoG
(this example works in Chrome. If you use Safari, change video/webm to video/mp4.)
Things I tried that aren't a good solution for me:
Storing all frames in some cache, then playing them back at the desired speed and recording that playback. This is unreliable, inconsistent, and memory-intensive.
Foreword
So... I've been investigating this for two days now and it's a complete mess. I don't have a full answer, but here's what I've tried and figured out so far.
The situation
First, I sketched this diagram of the Web Codecs / Insertable Streams APIs to better understand how everything links together:
(Diagram: MediaStream, MediaStreamTrack, VideoFrame, MediaStreamTrackProcessor, MediaStreamTrackGenerator, ...)
The most common use case / flow is that you have a MediaStream, such as a video camera feed or an existing video (playing on a canvas), which you then "break into" different MediaStreamTracks, usually an audio track and a video track, though the API actually supports subtitle, image, and shared-screen tracks as well.
So you break a MediaStream into a MediaStreamTrack of kind "video", which you then feed to a MediaStreamTrackProcessor to break the video track into individual VideoFrames. You can then do frame-by-frame manipulation, and when you're done, you're supposed to stream those VideoFrames into a MediaStreamTrackGenerator, which turns them back into a MediaStreamTrack. That track can in turn be stuffed into a MediaStream to make a sort of "full media object", i.e. something that contains both video and audio tracks.
Interestingly enough, I couldn't get a MediaStream to play on a <video> element directly, but I think that this is a hard requirement if we want to accomplish what OP wants.
As it currently stands, even when we have all the VideoFrames ready to go and turned into a MediaStream, we still have to, for some reason, record it a second time to create a proper Blob which <video> accepts. Think of this step pretty much as the "rendering" step of professional video-editing software, the only difference being that we already have the final frames, so why can't we just create a video out of those?
As far as I know, everything here that works for video also works for audio. For example, there actually exists an audio counterpart to VideoFrame (the WebCodecs spec now calls it AudioData; early drafts called it AudioFrame), though its documentation page is missing as I am writing this.
Encoding and Decoding
Furthermore, regarding VideoFrames and AudioFrames, there's also API support for encoding and decoding of those, which I actually tried in the hopes that encoding a VideoFrame with VP8 would somehow "bake" that duration and timestamp into it, as at least the duration of VideoFrame does not seem to do anything.
Here's my encoding / decoding code when I tried playing around with it. Note that this whole encoding and decoding business + codecs is one hell of a deep rabbit hole. I have no idea how I found this for example, but it did tell me that Chromium doesn't support hardware accelerated VP8 on Windows (no thanks to the codec error messages, which just babbled something about "cannot used closed codec"):
const createFrames = async (ctx, fps, streamWriter, width, height) => {
  const getRandomRgb = () => {
    var num = Math.round(0xffffff * Math.random());
    var r = num >> 16;
    var g = num >> 8 & 255;
    var b = num & 255;
    return 'rgb(' + r + ', ' + g + ', ' + b + ')';
  }

  const encodedChunks = [];
  const videoFrames = [];
  const encoderOutput = (encodedChunk) => {
    encodedChunks.push(encodedChunk);
  }
  const encoderError = (err) => {
    //console.error(err);
  }

  const encoder = new VideoEncoder({
    output: encoderOutput,
    error: encoderError
  })
  encoder.configure({
    //codec: "avc1.64001E",
    //avc: { format: "annexb" },
    codec: "vp8",
    hardwareAcceleration: "prefer-software", // VP8 with hardware acceleration not supported
    width: width,
    height: height,
    displayWidth: width,
    displayHeight: height,
    bitrate: 3_000_000,
    framerate: fps,
    bitrateMode: "constant",
    latencyMode: "quality"
  });

  const ft = 1 / fps;
  const micro = 1_000_000;
  const ft_us = Math.floor(ft * micro);

  for (let i = 0; i < 10; i++) {
    console.log(`Writing frames ${i * fps}-${(i + 1) * fps}`);
    ctx.fillStyle = getRandomRgb();
    ctx.fillRect(0, 0, width, height);
    ctx.fillStyle = "white";
    ctx.textAlign = "center";
    ctx.font = "80px Arial";
    ctx.fillText(`${i}`, width / 2, height / 2);

    for (let j = 0; j < fps; j++) {
      //console.log(`Writing frame ${i}.${j}`);
      const offset = i > 0 ? 1 : 0;
      const timestamp = i * ft_us * fps + j * ft_us;
      const duration = ft_us;
      var frameData = ctx.getImageData(0, 0, width, height);
      var buffer = frameData.data.buffer;
      const frame = new VideoFrame(buffer, {
        format: "RGBA",
        codedWidth: width,
        codedHeight: height,
        colorSpace: {
          primaries: "bt709",
          transfer: "bt709",
          matrix: "bt709",
          fullRange: true
        },
        timestamp: timestamp,
        duration: ft_us
      });
      encoder.encode(frame, { keyFrame: false });
      videoFrames.push(frame);
    }
  }

  //return videoFrames;
  await encoder.flush();
  //return encodedChunks;

  const decodedChunks = [];
  const decoder = new VideoDecoder({
    output: (frame) => {
      decodedChunks.push(frame);
    },
    error: (e) => {
      console.log(e.message);
    }
  });
  decoder.configure({
    codec: 'vp8',
    codedWidth: width,
    codedHeight: height
  });
  encodedChunks.forEach((chunk) => {
    decoder.decode(chunk);
  });
  await decoder.flush();
  return decodedChunks;
}
Frame calculations
Regarding your frame calculations, I did things a bit differently. Consider the following code:
const fps = 30;
const ft = 1 / fps;
const micro = 1_000_000;
const ft_us = Math.floor(ft * micro);
Ignoring how long it takes to create one frame (it should be irrelevant here, if we can set the frame duration), here's what I figured.
We want to play the video at 30 frames per second (fps). We generate 10 colored rectangles which we want to show on the screen for 1 second each, resulting in a video length of 10 seconds. This means that, in order to actually play the video at 30fps, we need to generate 30 frames for each rectangle. If we could set a frame duration, we could technically have only 10 frames with a duration of 1 second each, but then the fps would actually be 1 frame per second. We're doing 30fps though.
An fps of 30 gives us a frametime (ft) of 1 / 30 seconds, aka. the time that each frame is shown on the screen. We generate 30 frames for 1 rectangle -> 30 * (1 / 30) = 1 second checks out. The other thing here is that VideoFrame duration and timestamp do not accept seconds or milliseconds, but microseconds, so we need to turn that frametime (ft) to frametime in microseconds (ft_us), which is just (1 / 30) * 1 000 000 = ~33 333us.
Calculating the final duration and timestamp for each frame is a bit tricky as we are now looping twice, one loop for each rectangle and one loop for each frame of a rectangle at 30fps.
The timestamp for a frame j of rectangle i is (in English):
<i> * <frametime in us> * <fps> + <j> * <frametime in us> (+ <offset 0 or 1>)
Where <i> * <frametime in us> * <fps> gets us how many microseconds all previous rectangles take, and <j> * <frametime in us> gets us how many microseconds all previous frames of the current rectangle take. We also supply an optional offset: 0 when we're making the very first frame of the very first rectangle, and 1 otherwise, so that we avoid overlapping.
const fps = 30;
const ft = 1 / fps;
const micro = 1_000_000;
const ft_us = Math.floor(ft * micro);

// For each colored rectangle
for (let i = 0; i < 10; i++) {
  // For each frame of colored rectangle at 30fps
  for (let j = 0; j < fps; j++) {
    const offset = i > 0 ? 1 : 0;
    const timestamp = i * ft_us * fps + j * ft_us /* + offset */;
    const duration = ft_us; // one frametime per frame
    new VideoFrame({ duration, timestamp });
    ...
  }
}
This should get us 10 * 30 = 300 frames in total, for a video length of 10 seconds when played at 30 fps.
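As a sanity check on that arithmetic, here's a small self-contained sketch (the frameTimestamps helper is my own naming, not part of the original code) that generates all the timestamps and verifies the totals:

```javascript
// Hypothetical helper: compute the microsecond timestamp of frame j
// of rectangle i, mirroring the formula above (offset ignored).
const fps = 30;
const ft_us = Math.floor((1 / fps) * 1_000_000); // ~33333us per frame

function frameTimestamps(rectangles, fps, ft_us) {
  const timestamps = [];
  for (let i = 0; i < rectangles; i++) {
    for (let j = 0; j < fps; j++) {
      timestamps.push(i * ft_us * fps + j * ft_us);
    }
  }
  return timestamps;
}

const ts = frameTimestamps(10, fps, ft_us);
console.log(ts.length);         // 300 frames
console.log(ts[ts.length - 1]); // 9966567 (us), i.e. the last frame starts just before 10s
```

Note that because of the Math.floor, 300 frames at 33 333 us each only add up to 9 999 900 us, so the result is a hair under 10 seconds.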
My latest try and ReadableStream test
I've refactored everything so many times without luck, but here is my current attempt, where I try to use a ReadableStream to pass the generated VideoFrames to MediaStreamTrackGenerator (skipping the recording step), generate a MediaStream from that, and give the result to the srcObject of a <video> element:
const streamTrackGenerator = new MediaStreamTrackGenerator({ kind: 'video' });
const streamWriter = streamTrackGenerator.writable;

const chunks = await createFrames(ctx, fps, streamWriter, width, height); // array of VideoFrames

let idx = 0;
await streamWriter.ready; // note: .ready exists on a WritableStreamDefaultWriter, not on the WritableStream itself
const frameStream = new ReadableStream({
  start(controller) {
    controller.enqueue(chunks[idx]);
    idx++;
  },
  pull(controller) {
    if (idx >= chunks.length) {
      controller.close();
    }
    else {
      controller.enqueue(chunks[idx]);
      idx++;
    }
  },
  cancel(reason) {
    console.log("Cancelled", reason);
  }
});

await frameStream.pipeThrough(new TransformStream({
  transform(chunk, controller) {
    console.log(chunk); // debugging
    controller.enqueue(chunk) // passthrough
  }
})).pipeTo(streamWriter);

const mediaStreamTrack = streamTrackGenerator.clone();
const mediaStream = new MediaStream([mediaStreamTrack]);

const video = document.createElement('video');
video.style.width = `${width}px`;
video.style.height = `${height}px`;
document.body.appendChild(video);
video.srcObject = mediaStream;
video.setAttribute('controls', 'true')
video.onloadedmetadata = function(e) {
  video.play().catch(e => alert(e.message))
};
Try with VP8 encoding + decoding and trying to give VideoFrames to MediaSource via SourceBuffers
More info on MediaSource and SourceBuffers. This one is also me trying to exploit the MediaRecorder.start() function with its timeslice parameter in conjunction with MediaRecorder.requestData() to try and record frame-by-frame:
const init = async () => {
  const width = 256;
  const height = 256;
  const fps = 30;

  const createFrames = async (ctx, fps, streamWriter, width, height) => {
    const getRandomRgb = () => {
      var num = Math.round(0xffffff * Math.random());
      var r = num >> 16;
      var g = num >> 8 & 255;
      var b = num & 255;
      return 'rgb(' + r + ', ' + g + ', ' + b + ')';
    }

    const encodedChunks = [];
    const videoFrames = [];
    const encoderOutput = (encodedChunk) => {
      encodedChunks.push(encodedChunk);
    }
    const encoderError = (err) => {
      //console.error(err);
    }

    const encoder = new VideoEncoder({
      output: encoderOutput,
      error: encoderError
    })
    encoder.configure({
      //codec: "avc1.64001E",
      //avc: { format: "annexb" },
      codec: "vp8",
      hardwareAcceleration: "prefer-software",
      width: width,
      height: height,
      displayWidth: width,
      displayHeight: height,
      bitrate: 3_000_000,
      framerate: fps,
      bitrateMode: "constant",
      latencyMode: "quality"
    });

    const ft = 1 / fps;
    const micro = 1_000_000;
    const ft_us = Math.floor(ft * micro);

    for (let i = 0; i < 10; i++) {
      console.log(`Writing frames ${i * fps}-${(i + 1) * fps}`);
      ctx.fillStyle = getRandomRgb();
      ctx.fillRect(0, 0, width, height);
      ctx.fillStyle = "white";
      ctx.textAlign = "center";
      ctx.font = "80px Arial";
      ctx.fillText(`${i}`, width / 2, height / 2);

      for (let j = 0; j < fps; j++) {
        //console.log(`Writing frame ${i}.${j}`);
        const offset = i > 0 ? 1 : 0;
        const timestamp = i * ft_us * fps + j * ft_us;
        const duration = ft_us;
        var frameData = ctx.getImageData(0, 0, width, height);
        var buffer = frameData.data.buffer;
        const frame = new VideoFrame(buffer, {
          format: "RGBA",
          codedWidth: width,
          codedHeight: height,
          colorSpace: {
            primaries: "bt709",
            transfer: "bt709",
            matrix: "bt709",
            fullRange: true
          },
          timestamp: timestamp,
          duration: ft_us
        });
        encoder.encode(frame, { keyFrame: false });
        videoFrames.push(frame);
      }
    }

    //return videoFrames;
    await encoder.flush();
    //return encodedChunks;

    const decodedChunks = [];
    const decoder = new VideoDecoder({
      output: (frame) => {
        decodedChunks.push(frame);
      },
      error: (e) => {
        console.log(e.message);
      }
    });
    decoder.configure({
      codec: 'vp8',
      codedWidth: width,
      codedHeight: height
    });
    encodedChunks.forEach((chunk) => {
      decoder.decode(chunk);
    });
    await decoder.flush();
    return decodedChunks;
  }

  const canvas = new OffscreenCanvas(256, 256);
  const ctx = canvas.getContext("2d");
  const recordedChunks = [];

  const streamTrackGenerator = new MediaStreamTrackGenerator({ kind: 'video' });
  const streamWriter = streamTrackGenerator.writable.getWriter();

  const mediaStream = new MediaStream();
  mediaStream.addTrack(streamTrackGenerator);

  const mediaRecorder = new MediaRecorder(mediaStream, {
    mimeType: "video/webm",
    videoBitsPerSecond: 3_000_000
  });
  mediaRecorder.addEventListener('dataavailable', (event) => {
    recordedChunks.push(event.data);
    console.log(event)
  });
  mediaRecorder.addEventListener('stop', (event) => {
    console.log("stopped?")
    console.log('Frames written');
    console.log('Stopping MediaRecorder');
    console.log('Closing StreamWriter');
    const blob = new Blob(recordedChunks, { type: mediaRecorder.mimeType });
    const url = URL.createObjectURL(blob);
    const video = document.createElement('video');
    video.src = url;
    document.body.appendChild(video);
    video.setAttribute('controls', 'true')
    video.play().catch(e => alert(e.message))
  });

  console.log('StreamWrite ready');
  console.log('Starting mediarecorder');
  console.log('Creating frames');

  const chunks = await createFrames(ctx, fps, streamWriter, width, height);

  mediaRecorder.start(33333);
  for (const key in chunks) {
    await streamWriter.ready;
    const chunk = chunks[key];
    //await new Promise(resolve => setTimeout(resolve, 1))
    await streamWriter.write(chunk);
    mediaRecorder.requestData();
  }
  //await streamWriter.ready;
  //streamWriter.close();
  //mediaRecorder.stop();

  /*const mediaSource = new MediaSource();
  const video = document.createElement('video');
  document.body.appendChild(video);
  video.setAttribute('controls', 'true')
  const url = URL.createObjectURL(mediaSource);
  video.src = url;
  mediaSource.addEventListener('sourceopen', function() {
    var mediaSource = this;
    const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp8"');
    let allocationSize = 0;
    chunks.forEach((c) => { allocationSize += c.byteLength });
    var buf = new ArrayBuffer(allocationSize);
    chunks.forEach((chunk) => {
      chunk.copyTo(buf);
    });
    sourceBuffer.addEventListener('updateend', function() {
      //mediaSource.endOfStream();
      video.play();
    });
    sourceBuffer.appendBuffer(buf);
  });*/
  //video.play().catch(e => alert(e.message))

  /*mediaStream.getTracks()[0].stop();
  const blob = new Blob(chunks, { type: "video/webm" });
  const url = URL.createObjectURL(blob);
  const video = document.createElement('video');
  video.srcObject = url;
  document.body.appendChild(video);
  video.setAttribute('controls', 'true')
  video.play().catch(e => alert(e.message))*/
  //mediaRecorder.stop();
}
Conclusion / Afterwords
After all that I tried, I had the most problems with turning frames into tracks and tracks into streams, etc. There is so much (poorly documented) converting from one thing to another, and half of it is done with streams, which also lack a lot of documentation. There doesn't even seem to be any meaningful way to create custom ReadableStreams and WritableStreams without the use of NPM packages.
I never got VideoFrame duration working. What surprised me the most is that basically nothing else in the process mattered with regard to video or frame length other than adjusting the hacky await new Promise(resolve => setTimeout(resolve, 1000)) timing, but even with that, the recording was really inconsistent. If there was any lag during recording, it would show up in the recording; I had recordings where some rectangles were shown for half a second and others for 2 seconds. Interestingly enough, the whole recording process would sometimes break completely if I removed the arbitrary setTimeout; a program that would break without the timeout would work with await new Promise(resolve => setTimeout(resolve, 1)). This is usually a clue that it has something to do with the JS event loop, as a setTimeout with a ~0 ms timing tells JS to "wait for the next event-loop turn".
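That event-loop behaviour can be demonstrated in isolation. This sketch is generic JavaScript, nothing MediaRecorder-specific: a setTimeout(..., 0) callback only runs on a later turn of the loop, after the synchronous code and any pending microtasks, which is why yielding with such an await can give the browser a chance to service the stream/recorder machinery:

```javascript
// Generic demonstration of macrotask vs. microtask ordering.
function demoOrdering() {
  const order = [];
  return new Promise((resolve) => {
    order.push('sync');                                    // runs immediately
    Promise.resolve().then(() => order.push('microtask')); // runs after current code
    setTimeout(() => {                                     // runs on a later event-loop turn
      order.push('macrotask(setTimeout 0)');
      resolve(order);
    }, 0);
  });
}

demoOrdering().then((order) => console.log(order.join(' -> ')));
// sync -> microtask -> macrotask(setTimeout 0)
```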
I'm still going to work on this a bit, but I'm doubtful I'll make any further progress. I'd like to get this to work without the use of MediaRecorder and by utilizing streams to work out resource issues.
One really interesting thing that I bumped into was that MediaStreamTrackGenerator is actually old news. The W3C documentation only really talks about VideoTrackGenerator, and there's an interesting take on how to basically build a VideoTrackGenerator from the existing MediaStreamTrackGenerator. Also note this part specifically:
Interestingly enough, this tells us that MediaStreamTrackGenerator.clone() returns a plain MediaStreamTrack, which I tried to put to use, but without success.
Anyway, I hope this might give you some new ideas or clarify some things. Maybe you'll figure out something I didn't. Have a good one and do tell if you have questions or figure something out!
Further reading
w3.org VideoFrame and duration
Edit 1
Forgot to mention that I used an OffscreenCanvas and its context instead of a normal canvas. As we're also talking about performance here, I figured I'd try and see how OffscreenCanvas works.
I also used the second form of the VideoFrame constructor, that is, I gave it an ArrayBuffer instead of a bitmap image like in your code.
Although you have an accepted answer, I'll add my two cents' worth of advice...
"Generating each frame takes some time, let's say 1 second, and I would like to generate the video at 10 fps. If generating each frame takes 1 second, despite indicating timestamp and duration, the video is played at 1frame/sec, not 10fps.
How can I encode the video frames at the desired frame rate?"
Encoding a 10 frames-per-second video from your for loop of 10 bitmaps would give you a video with 1 second of duration (but it travels through all 10 frames during that 1-second interval).
What you want then is a new frame every 100 ms, until those 10 frames make up 1000 ms.
To achieve that 10 FPS, you simply...
First pause the recorder with mediaRecorder.pause();
Now generate your bitmap (this process can take any length of time)
When frame/bitmap is ready, then resume the recorder for 100ms with mediaRecorder.resume();
To achieve the 100ms per frame, you can use a Timer that re-pauses the recording.
Think of it as using a camcorder, where you:
press record -> await 100ms of capture -> pause -> new frame ->repeat press record until 10 frames.
Here is a quick-ish example as a starting point (eg: readers should improve upon it):
<!DOCTYPE html>
<html>
<body>

<button onclick="recorder_Setup()">Create Video</button>
<h2 id="demo"></h2>

<script>
//# Create canvas for dummy frames
const canvas = document.createElement('canvas');
canvas.width = 256;
canvas.height = 256;
document.body.appendChild(canvas);
const ctx = canvas.getContext('2d');

const recordedChunks = [];
var mediaRecorder; var generator; var writer;
var stream; var frame; var frameCount = 0;

//# needed to pause the function, whilst recorder stores frames at the specified interval
const sleep = (sleep_time) => { return new Promise(resolve => setTimeout(resolve, sleep_time)) }

//# setup recorder
recorder_Setup(); //# create and start recorder here

function getRandomRgb()
{
    var num = Math.round(0xffffff * Math.random());
    var r = num >> 16;
    var g = num >> 8 & 255;
    var b = num & 255;
    return 'rgb(' + r + ', ' + g + ', ' + b + ')';
}

function recorder_Setup()
{
    //# create media generator track
    generator = new MediaStreamTrackGenerator({kind: 'video'});
    writer = generator.writable.getWriter();
    stream = new MediaStream();
    stream.addTrack(generator);

    var myObj = {
        mimeType: "video/webm",
        videoBitsPerSecond: 3_000_000 // 3 Mbps
    };

    mediaRecorder = new MediaRecorder(stream, myObj);
    mediaRecorder.addEventListener('dataavailable', (event) => { onFrameData(event.data); });
    mediaRecorder.addEventListener("stop", (event) => { recorder_Stop() });

    //# start the recorder... and start adding frames
    mediaRecorder.start();
    recorder_addFrame();
}

function onFrameData(input)
{
    //console.log("got frame data... frame count v2 : " + frameCount);
    recordedChunks.push(input);
}

async function recorder_addFrame()
{
    mediaRecorder.pause();
    await new Promise(resolve => setTimeout(resolve, 1000))

    //# add text for frame number
    ctx.fillStyle = "#808080";
    ctx.fillRect(0, 0, 256, 256);
    ctx.font = "30px Arial"; ctx.fillStyle = "#FFFFFF";
    ctx.fillText("frame : " + frameCount, 10, 50);

    //# add color fill for frame pixels
    ctx.fillStyle = getRandomRgb();
    ctx.fillRect(0, 70, 256, 180);

    //# note "timestamp" and "duration" don't mean anything here...
    frame = new VideoFrame(await createImageBitmap(canvas), {timestamp: 0, duration: 0});

    console.log("frame count v1 : " + frameCount);
    frameCount++;

    //# When ready to write a frame, you resume the recorder for the required interval period
    //# (eg: 10 FPS = 1000/10 = 100 ms interval per frame within each 1000 ms second)...
    mediaRecorder.resume();
    await sleep(100);
    writer.write(frame);
    frame.close();

    if (frameCount >= 10) { mediaRecorder.stop(); }
    else { recorder_addFrame(); }
}

function recorder_Stop()
{
    console.log("recorder stopped");
    stream.getTracks().forEach(track => track.stop());

    const blob = new Blob(recordedChunks, {type: mediaRecorder.mimeType});
    const url = URL.createObjectURL(blob);

    const video = document.createElement('video');
    video.src = url;
    document.body.appendChild(video);
    video.setAttribute('controls', 'true')
    video.setAttribute('muted', 'true')
    //video.play().catch(e => alert(e.message))
}
</script>

</body>
</html>
Related
Is it possible to get raw values of audio data using MediaRecorder()
I'm using MediaRecorder() with getUserMedia() to record audio data from the browser. It works, but recorded data is recorded in the Blob format. I want to get raw audio data (amplitudes), not the Blobs. Is it possible to do it? My code looks like this: navigator.mediaDevices.getUserMedia({audio: true, video: false}).then(stream => { const recorder = new MediaRecorder(stream); recorder.ondataavailable = e => { console.log(e.data); // output: Blob { size: 8452, type: "audio/ogg; codecs=opus" } }; recorder.start(1000); // send data every 1s }).catch(console.error);
MediaRecorder is useful to create files; if you want to do audio processing, Web Audio would be a better approach. See this HTML5Rocks tutorial which shows how to integrate getUserMedia with Web Audio using createMediaStreamSource from Web Audio.
No need for MediaRecorder. Use web audio to gain access to raw data values, e.g. like this: navigator.mediaDevices.getUserMedia({audio: true}) .then(spectrum).catch(console.log); function spectrum(stream) { const audioCtx = new AudioContext(); const analyser = audioCtx.createAnalyser(); audioCtx.createMediaStreamSource(stream).connect(analyser); const canvas = div.appendChild(document.createElement("canvas")); canvas.width = window.innerWidth - 20; canvas.height = window.innerHeight - 20; const ctx = canvas.getContext("2d"); const data = new Uint8Array(canvas.width); ctx.strokeStyle = 'rgb(0, 125, 0)'; setInterval(() => { ctx.fillStyle = "#a0a0a0"; ctx.fillRect(0, 0, canvas.width, canvas.height); analyser.getByteFrequencyData(data); ctx.lineWidth = 2; let x = 0; for (let d of data) { const y = canvas.height - (d / 128) * canvas.height / 4; const c = Math.floor((x*255)/canvas.width); ctx.fillStyle = `rgb(${c},0,${255-x})`; ctx.fillRect(x++, y, 2, canvas.height - y) } analyser.getByteTimeDomainData(data); ctx.lineWidth = 5; ctx.beginPath(); x = 0; for (let d of data) { const y = canvas.height - (d / 128) * canvas.height / 2; x ? ctx.lineTo(x++, y) : ctx.moveTo(x++, y); } ctx.stroke(); }, 1000 * canvas.width / audioCtx.sampleRate); };
I used the following code to achieve an array of integers into console. As to what they mean, I assume they are amplitudes. const mediaRecorder = new MediaRecorder(mediaStream); mediaRecorder.ondataavailable = async (blob: BlobEvent) => console.log(await blob.data.arrayBuffer()); mediaRecorder.start(100); If you were to want to draw them, then I found this example that I had forked on codepen: https://codepen.io/rhwilburn/pen/vYXggbN which pulses the circle once you click on it.
This is a complex subject because the MediaRecorder API is able to record audio using multiple different container formats, and even further to that, it also supports several audio codecs. However, let's say you tie down your audio recordings to be of a certain codec and container type, for example, if from the front end you force all recordings to use the WebM container format with Opus encoded audio: new MediaRecorder(stream as MediaStream, { mimeType: "audio/webm;codecs=opus", }); If the browser supports this type of recording (the latest Firefox / Edge / Chrome seem to do it for me, but may not work in all browsers and versions thereof, for example I can see that as of right now, Safari only provides partial support for this and therefore it may not work in that browser), then you can write some code to decode the WebM container's Opus encoded chunks, and then decode those chunks into raw PCM audio. I see that these WebM/Opus containers produced by MediaRecorder in Firefox / Edge / Chrome typically have a SamplingFrequency block and a Channels block at the start, which indicate the sample rate in Hz and the number of channels. Then, they typically start clustering audio using Cluster blocks, which have a Timestamp block immediately after the Cluster block (but not necessarily guaranteed to be after, as per the WebM specification), followed by Simple blocks which contain binary Opus audio data. Clusters of Simple blocks are ordered by the timestamp of the Cluster block, and the Simple blocks within those clusters are ordered by the timecode header within the Opus audio chunk contained within them, which is stored in a signed two's compliment format. 
So, putting all of this together, this seems to work for me in Node.js using Typescript: import opus = require('#discordjs/opus'); import ebml = require('ts-ebml'); class BlockCluster { timestamp: number; encodedChunks: Buffer[] = []; } async function webMOpusToPcm(buffer: Buffer): Promise<{ pcm: Buffer, duration: number }> { const decoder = new ebml.Decoder(); let rate: number; let channels: number; let clusters: BlockCluster[] = []; for await (const element of decoder.decode(buffer)) { if (element.name === 'SamplingFrequency' && element.type === 'f') rate = (element as any).value; if (element.name === 'Channels' && element.type === 'u') channels = (element as any).value; if (element.name === 'Cluster' && element.type === 'm') clusters.push(new BlockCluster()); if (element.name === 'Timestamp' && element.type === 'u') clusters[clusters.length - 1].timestamp = (element as any).value; if (element.name === 'SimpleBlock' && element.type === 'b') { const data: Uint8Array = (element as any).data; const dataBuffer: Buffer = Buffer.from(data); clusters[clusters.length - 1].encodedChunks.push(dataBuffer); } } clusters.sort((a, b) => { return a.timestamp - b.timestamp; }); let chunks: Uint8Array[] = []; clusters.forEach(cluster => { cluster.encodedChunks.sort((a, b) => { const timecodeA = readTimecode(a); const timecodeB = readTimecode(b); return timecodeA - timecodeB; }); cluster.encodedChunks.forEach(chunk => { const opusChunk = readOpusChunk(chunk); chunks.push(opusChunk); }); }); let pcm: Buffer = Buffer.alloc(0); const opusDecoder = new opus.OpusEncoder(rate, channels); chunks.forEach(chunk => { const opus = Buffer.from(chunk); const decoded: Buffer = opusDecoder.decode(opus); pcm = Buffer.concat([pcm, decoded]); }); const totalSamples = (pcm.byteLength * 8) / (channels * 16); const duration = totalSamples / rate; return { pcm, duration }; } function readOpusChunk(block: Buffer): Buffer { return block.slice(4); } function readTimecode(block: Buffer): number { const 
timecode = (block.readUInt8(0) << 16) | (block.readUInt8(1) << 8) | block.readUInt8(2); return (timecode & 0x800000) ? -(0x1000000 - timecode) : timecode; } Note that your input buffer needs to be the WebM/Opus container, and that when playing back the raw PCM audio, you must correctly set the sample rate in Hz and the number of channels must also be set correctly (to the number of channels you read from the WebM container), otherwise the audio will sound distorted. I think that the #discordjs/opus library currently uses a consistent bit depth of 16 to encode raw audio, so you may also have to set the bit depth of the playback to 16. Furthermore, what works for me today, may not work tomorrow. Libraries may change how they work, browsers may change how they record and containerize, the codecs may change, some browsers may not even support recording using this format, and the format of the containers may change as well. There are a lot of variables, so please take caution.
Efficient method of changing the image displayed on an <img>?
In javascript, I have a reference to a DOM <img> element. I need to change the image displayed by the <img> from within the javascript. So far, I have tried this by changing the image.src attribute to the URL of the new image. This worked, but there is still a problem: I need the image to be changed many times per second, and changing the src attribute causes the browser to do a new GET request for the image (which is already cached), which puts strain on the web server since there will be hundreds of simultanious clients. My current (inefficient) solution is done like this: let image = document.querySelector("img"); setInterval(function(){ image.src = getNextImageURL();//this causes a GET request }, 10); function getNextImageURL() { /*...implementation...*/ } <img> I am looking for a more efficient way of changing the image, that does not cause any unnecessary HTTP requests.
I think you need to code in a different way... if you want to reduce requests you should combine all image in one sprite Sheet. this trick is used so many in-game animations. But you will need to change <img/> tag to another tag like <div></div> the idea here is that we have one image and we just change the viewport for what we want sample For sprite sheet let element = document.querySelector("div"); let i = 1 setInterval(function(){ element.style.backgroundPosition= getNextImagePostion(i++);//this will change the posion }, 100); function getNextImagePostion(nextPos) { /*...implementation...*/ // this just for try // 6 is the number of images in sheet // 20 is the height for one segment in sheet var posX = 100*(nextPos%6); // first is position on x and latter is position on y return "-"+posX+"px 0px "; } .image { width: 100px; /* width of single image*/ height: 100px; /*height of single image*/ background-image: url(https://picsum.photos/500/100?image=8) /*path to your sprite sheet*/ } <div class="image"> </div>
If the sprite-sheet idea doesn't work (e.g. because you have big images), then use a canvas. You just need to preload all your images, store them in an Array, and then draw them one after the other on your canvas:

const urls = new Array(35).fill(0).map((v, i) =>
  'https://picsum.photos/500/500?image=' + i
);
// load all the images
const loadImg = Promise.all(
  urls.map(url =>
    new Promise((res, rej) => {
      // each url will have its own <img>
      const img = new Image();
      img.onload = e => res(img);
      img.onerror = rej;
      img.src = url;
    })
  )
);
// when they're all loaded
loadImg.then(imgs => {
  // prepare the canvas
  const canvas = document.getElementById('canvas');
  // set its size
  canvas.width = canvas.height = 500;
  // get the drawing context
  const ctx = canvas.getContext('2d');
  let i = 0;
  // start the animation
  anim();
  function anim() {
    // do it again at next screen refresh (~16ms on a 60Hz monitor)
    requestAnimationFrame(anim);
    // increment our index
    i = (i + 1) % imgs.length;
    // draw the required image
    ctx.drawImage(imgs[i], 0, 0);
  }
})
.catch(console.error);

<canvas id="canvas"></canvas>

And if you wish time control:

const urls = new Array(35).fill(0).map((v, i) =>
  'https://picsum.photos/500/500?image=' + i
);
// load all the images
const loadImg = Promise.all(
  urls.map(url =>
    new Promise((res, rej) => {
      // each url will have its own <img>
      const img = new Image();
      img.onload = e => res(img);
      img.onerror = rej;
      img.src = url;
    })
  )
);
// when they're all loaded
loadImg.then(imgs => {
  // prepare the canvas
  const canvas = document.getElementById('canvas');
  // set its size
  canvas.width = canvas.height = 500;
  // get the drawing context
  const ctx = canvas.getContext('2d');
  const duration = 100; // the number of ms each image should last
  let i = 0;
  let lastTime = performance.now();
  // start the animation
  requestAnimationFrame(anim);
  // rAF passes a timestamp
  function anim(time) {
    // do it again at next screen refresh (~16ms on a 60Hz monitor)
    requestAnimationFrame(anim);
    const timeDiff = time - lastTime;
    if (timeDiff < duration) { // duration has not yet elapsed
      return;
    }
    // update lastTime
    lastTime = time - (timeDiff - duration);
    // increment our index
    i = (i + 1) % imgs.length;
    // draw the required image
    ctx.drawImage(imgs[i], 0, 0);
  }
})
.catch(console.error);

<canvas id="canvas"></canvas>
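The fixed-rate pattern in the second snippet can be distilled into a small pure helper that maps a rAF timestamp to the index of the image to draw (the helper name is mine, not from the snippet):

```javascript
// Map a rAF timestamp (ms) to the index of the image to draw in a
// looping sequence. frameDuration is ms per image, e.g. 100 for 10 fps.
function loopFrameIndex(timestampMs, frameCount, frameDuration) {
  return Math.floor(timestampMs / frameDuration) % frameCount;
}
```

Inside anim(time) you could then just do ctx.drawImage(imgs[loopFrameIndex(time, imgs.length, 100)], 0, 0) - no lastTime bookkeeping, and the sequence stays in sync with the clock even if individual frames are dropped.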
Web Worker always 25% slower than main thread with expensive operations on Canvas pixels
I made a simple application which takes a photo from a video tag and makes it gray, available in full here: Canvas WebWorker PoC:

const photoParams = [
  0,   // x
  0,   // y
  320, // width
  240, // height
];

async function startVideo () {
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: false,
    video: true,
  });
  const video = document.querySelector('#video');
  video.srcObject = stream;
  video.play();
  return video;
}

function takePhoto () {
  const video = document.querySelector('#video');
  const canvas = document.querySelector('#canvas');
  canvas.width = 320;
  canvas.height = 240;
  const context = canvas.getContext('2d');
  context.drawImage(video, ...photoParams);
  const imageData = applyFilter({imageData: context.getImageData(...photoParams)});
  context.putImageData(imageData, 0, 0);
  return canvas.toDataURL("image/png");
}

function setPhoto () {
  const photo = takePhoto();
  const image = document.querySelector('#image');
  image.src = photo;
}

startVideo();
const button = document.querySelector('#button');
button.addEventListener('click', setPhoto);

In one of the functions, I placed a long, unnecessary for loop to make it really slow:

function transformPixels ({data}) {
  let rawPixels;
  const array = new Array(2000);
  for (const element of array) {
    rawPixels = [];
    const pixels = getPixels({ data });
    const filteredPixels = [];
    for (const pixel of pixels) {
      const average = getAverage(pixel);
      filteredPixels.push(new Pixel({
        red: average,
        green: average,
        blue: average,
        alpha: pixel.alpha,
      }));
    }
    for (const pixel of filteredPixels) {
      rawPixels.push(...pixel.toRawPixels());
    }
  }
  return rawPixels;
};

And I created a Web Worker version which, as I thought, should be faster because it does not block the main thread:

function setPhotoWorker () {
  const video = document.querySelector('#video');
  const canvas = document.querySelector('#canvas');
  canvas.width = 320;
  canvas.height = 240;
  const context = canvas.getContext('2d');
  context.drawImage(video, ...photoParams);
  const imageData = context.getImageData(...photoParams);
  const worker = new Worker('filter-worker.js');
  worker.onmessage = (event) => {
    const rawPixelsArray = [...JSON.parse(event.data)];
    rawPixelsArray.forEach((element, index) => {
      imageData.data[index] = element;
    });
    context.putImageData(imageData, 0, 0);
    const image = document.querySelector('#image');
    image.src = canvas.toDataURL("image/png");
  }
  worker.postMessage(JSON.stringify([...imageData.data]));
}

Which could be run this way:

button.addEventListener('click', setPhotoWorker);

The worker code is almost exactly the same as the single-threaded version, except for one thing - to improve messaging performance, a string is sent instead of an array of numbers:

worker.onmessage = (event) => {
  const rawPixelsArray = [...JSON.parse(event.data)];
};
//...
worker.postMessage(JSON.stringify([...imageData.data]));

And inside filter-worker.js:

onmessage = (data) => {
  const rawPixels = transformPixels({data: JSON.parse(data.data)});
  postMessage(JSON.stringify(rawPixels));
};

The problem is that the worker version is always about 20-25% slower than the main-thread version. First I thought it might be the size of the message, but my laptop has a 640 x 480 camera, which gives 307,200 items - which I don't think is expensive enough to explain why, over 2000 for-loop iterations, the results are: main thread about 160 seconds, worker about 200 seconds. You can download the app from the Github repo and check it on your own. The pattern is quite the same there - the worker is always 20-25% slower. Without using the JSON API, the worker needs something like 220 seconds to finish its job. The only reason I can think of is that the worker thread has a very low priority, and in my application, where the main thread doesn't have much to do, it is simply slower - while in a real-world app, where the main thread might be busier, the worker would win. Do you have any ideas why the worker is so slow? Thank you for every answer.
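One likely contributor worth ruling out before blaming thread priority: the JSON.stringify/JSON.parse round trip of 307,200 numbers happens in both directions. Typed arrays can be posted to a worker directly, and their underlying ArrayBuffer can even be transferred without a copy via worker.postMessage(buf, [buf]). The filter itself can also be a tight loop over the raw RGBA bytes - a sketch (the function name is mine, not from the question):

```javascript
// Grayscale a raw RGBA buffer in place - no Pixel objects, no JSON.
// In a worker you would call this on a Uint8ClampedArray wrapping a
// transferred ArrayBuffer: worker.postMessage(data.buffer, [data.buffer]).
function grayscaleInPlace(pixels /* Uint8ClampedArray, RGBA order */) {
  for (let i = 0; i < pixels.length; i += 4) {
    const avg = (pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3;
    pixels[i] = pixels[i + 1] = pixels[i + 2] = avg; // alpha untouched
  }
  return pixels;
}
```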
Consistent FPS in frame by frame video with <canvas>
I'm trying to display a video precisely enough that I can stop on or jump to a specific frame. For now my approach is to display the video frame by frame on a canvas (I have the list of images to display; I don't have to extract them from the video). The speed doesn't really matter as long as it's consistent and around 30fps. Compatibility somewhat matters (we can ignore IE≤8).

So first off, I'm pre-loading all the images:

var all_images_loaded = {};
var all_images_src = ["Continuity_0001.png", "Continuity_0002.png", ..., "Continuity_0161.png"];

function init() {
  for (var i = all_images_src.length - 1; i >= 0; i--) {
    var objImage = new Image();
    objImage.onload = imagesLoaded;
    objImage.src = 'Continuity/' + all_images_src[i];
    all_images_loaded[all_images_src[i]] = objImage;
  }
}

var loaded_count = 0;
function imagesLoaded () {
  console.log(loaded_count + " / " + all_images_src.length);
  if (++loaded_count === all_images_src.length) startvid();
}

init();

and once that's done, the function startvid() is called. The first solution I came up with was to draw on requestAnimationFrame() after a setTimeout (to tame the fps):

var canvas = document.getElementsByTagName('canvas')[0];
var ctx = canvas.getContext("2d");
var video_pointer = 0;

function startvid () {
  video_pointer++;
  if (all_images_src[video_pointer]) {
    window.requestAnimationFrame((function (video_pointer) {
      ctx.drawImage(all_images_loaded[all_images_src[video_pointer]], 0, 0);
    }).bind(undefined, video_pointer));
    setTimeout(startvid, 33);
  }
}

but that felt somewhat slow and irregular...

So the second solution is to use 2 canvases: draw on the hidden one, then switch it to visible with the proper timing:

var canvas = document.getElementsByTagName('canvas');
var ctx = [canvas[0].getContext("2d"), canvas[1].getContext("2d")];
var curr_can_is_0 = true;
var video_pointer = 0;

function startvid () {
  video_pointer++;
  curr_can_is_0 = !curr_can_is_0;
  if (all_images_src[video_pointer]) {
    ctx[curr_can_is_0 ? 1 : 0].drawImage(all_images_loaded[all_images_src[video_pointer]], 0, 0);
    window.requestAnimationFrame((function (curr_can_is_0, video_pointer) {
      ctx[curr_can_is_0 ? 0 : 1].canvas.style.visibility = "visible";
      ctx[curr_can_is_0 ? 1 : 0].canvas.style.visibility = "hidden";
    }).bind(undefined, curr_can_is_0, video_pointer));
    setTimeout(startvid, 33);
  }
}

but that too feels slow and irregular... Yet Google Chrome (which I'm developing on) seems to have plenty of idle time. So what can I do?
The Problem:

Your main issue is that setTimeout and setInterval are not guaranteed to fire at exactly the delay specified, but at some point after the delay. From the MDN article on setTimeout (emphasis added by me):

delay is the number of milliseconds (thousandths of a second) that the function call should be delayed by. If omitted, it defaults to 0. The actual delay may be longer; see Notes below.

Here are the relevant notes from MDN mentioned above:

Historically browsers implement setTimeout() "clamping": successive setTimeout() calls with delay smaller than the "minimum delay" limit are forced to use at least the minimum delay. The minimum delay, DOM_MIN_TIMEOUT_VALUE, is 4 ms (stored in a preference in Firefox: dom.min_timeout_value), with a DOM_CLAMP_TIMEOUT_NESTING_LEVEL of 5. In fact, 4 ms is specified by the HTML5 spec and is consistent across browsers released in 2010 and onward. Prior to (Firefox 5.0 / Thunderbird 5.0 / SeaMonkey 2.2), the minimum timeout value for nested timeouts was 10 ms.

In addition to "clamping", the timeout can also fire later when the page (or the OS/browser itself) is busy with other tasks.

The Solution:

You would be better off using just requestAnimationFrame, and inside the callback using the timestamp argument passed to the callback to compute the delta time into the video, then drawing the necessary frame from the list. See the working example below. As a bonus, I've even included code to prevent re-drawing the same frame twice.

Working Example:

var start_time = null;
var frame_rate = 30;
var canvas = document.getElementById('video');
var ctx = canvas.getContext('2d');
var all_images_loaded = {};
var all_images_src = (function(frames, fps){ // Generate some placeholder images.
  var a = [];
  var zfill = function(s, l) {
    s = '' + s;
    while (s.length < l) {
      s = '0' + s;
    }
    return s;
  }
  for (var i = 0; i < frames; i++) {
    a[i] = 'http://placehold.it/480x270&text=' + zfill(Math.floor(i / fps), 2) + '+:+' + zfill(i % fps, 2)
  }
  return a;
})(161, frame_rate);
var video_duration = (all_images_src.length / frame_rate) * 1000;

function init() {
  for (var i = all_images_src.length - 1; i >= 0; i--) {
    var objImage = new Image();
    objImage.onload = imagesLoaded;
    //objImage.src = 'Continuity/'+all_images_src[i];
    objImage.src = all_images_src[i];
    all_images_loaded[all_images_src[i]] = objImage;
  }
}

var loaded_count = 0;
function imagesLoaded () {
  //console.log(loaded_count + " / " + all_images_src.length);
  if (++loaded_count === all_images_src.length) {
    startvid();
  }
}

function startvid() {
  requestAnimationFrame(draw);
}

var last_frame = null;
function draw(timestamp) {
  // Set the start time on the first call.
  if (!start_time) {
    start_time = timestamp;
  }
  // Find the current time in the video.
  var current_time = (timestamp - start_time);
  // Check that it is less than the end of the video.
  if (current_time < video_duration) {
    // Find the delta of the video completed.
    var delta = current_time / video_duration;
    // Find the frame for that delta.
    var current_frame = Math.floor(all_images_src.length * delta);
    // Only draw this frame if it is different from the last one.
    if (current_frame !== last_frame) {
      ctx.drawImage(all_images_loaded[all_images_src[current_frame]], 0, 0);
      last_frame = current_frame;
    }
    // Continue the animation loop.
    requestAnimationFrame(draw);
  }
}

init();

<canvas id="video" width="480" height="270"></canvas>
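The frame-picking arithmetic inside draw() can be pulled out into a pure function, which makes the timestamp-to-frame mapping easy to test on its own (the function name is mine, not from the example above):

```javascript
// Which frame should be on screen at `elapsedMs` into a clip of
// `frameCount` frames at `fps`? Clamped to the last frame at the end.
function frameForTime(elapsedMs, frameCount, fps) {
  const durationMs = (frameCount / fps) * 1000;
  if (elapsedMs >= durationMs) return frameCount - 1;
  return Math.floor(frameCount * (elapsedMs / durationMs));
}
```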
Create a waveform of the full track with Web Audio API
Realtime moving Waveform

I'm currently playing with the Web Audio API and made a spectrum using canvas.

function animate(){
  var a = new Uint8Array(analyser.frequencyBinCount),
      y = new Uint8Array(analyser.frequencyBinCount), b, c, d;
  analyser.getByteTimeDomainData(y);
  analyser.getByteFrequencyData(a);
  b = c = a.length;
  d = w / c;
  ctx.clearRect(0, 0, w, h);
  while (b--) {
    var bh = a[b] + 1;
    ctx.fillStyle = 'hsla(' + (b / c * 240) + ',' + (y[b] / 255 * 100 | 0) + '%,50%,1)';
    ctx.fillRect(1 * b, h - bh, 1, bh);
    ctx.fillRect(1 * b, y[b], 1, 1);
  }
  animation = webkitRequestAnimationFrame(animate);
}

Mini question: is there a way to not write new Uint8Array(analyser.frequencyBinCount) twice?

DEMO: add an MP3/MP4 file and wait. (tested in Chrome) http://jsfiddle.net/pc76H/2/

But there are many problems. I can't find proper documentation of the various audio filters. Also, if you look at the spectrum you will notice that after 70% of the range there is no data. What does that mean? That maybe there is no sound from 16 kHz to 20 kHz? I would add text to the canvas to show the various Hz - but where? I found out that the returned data has a power-of-2 length with a max of 2048, and the height is always 256.

BUT the real question is... I want to create a moving waveform like in Traktor. I already did that some time ago with PHP: it converts the file to a low bitrate, then extracts the data and converts that to an image. I found the script somewhere... but I don't remember where...
note: needs LAME

<?php
$a = $_GET["f"];
if (file_exists($a)) {
  if (file_exists($a.".png")) {
    header("Content-Type: image/png");
    echo file_get_contents($a.".png");
  } else {
    $b = 3000; $c = 300; define("d", 3);
    ini_set("max_execution_time", "30000");
    function n($g, $h) {
      $g = hexdec(bin2hex($g));
      $h = hexdec(bin2hex($h));
      return ($g + ($h * 256));
    };
    $k = substr(md5(time()), 0, 10);
    copy(realpath($a), "/var/www/".$k."_o.mp3");
    exec("lame /var/www/{$k}_o.mp3 -f -m m -b 16 --resample 8 /var/www/{$k}.mp3 && lame --decode /var/www/{$k}.mp3 /var/www/{$k}.wav");
    //system("lame {$k}_o.mp3 -f -m m -b 16 --resample 8 {$k}.mp3 && lame --decode {$k}.mp3 {$k}.wav");
    #unlink("/var/www/{$k}_o.mp3");
    #unlink("/var/www/{$k}.mp3");
    $l = "/var/www/{$k}.wav";
    $m = fopen($l, "r");
    $n[] = fread($m, 4);
    $n[] = bin2hex(fread($m, 4));
    $n[] = fread($m, 4);
    $n[] = fread($m, 4);
    $n[] = bin2hex(fread($m, 4));
    $n[] = bin2hex(fread($m, 2));
    $n[] = bin2hex(fread($m, 2));
    $n[] = bin2hex(fread($m, 4));
    $n[] = bin2hex(fread($m, 4));
    $n[] = bin2hex(fread($m, 2));
    $n[] = bin2hex(fread($m, 2));
    $n[] = fread($m, 4);
    $n[] = bin2hex(fread($m, 4));
    $o = hexdec(substr($n[10], 0, 2));
    $p = $o / 8;
    $q = hexdec(substr($n[6], 0, 2));
    if ($q == 2) { $r = 40; } else { $r = 80; };
    while (!feof($m)) {
      $t = array();
      for ($i = 0; $i < $p; $i++) {
        $t[$i] = fgetc($m);
      };
      switch ($p) {
        case 1: $s[] = n($t[0], $t[1]); break;
        case 2:
          if (ord($t[1]) & 128) { $u = 0; } else { $u = 128; };
          $u = chr((ord($t[1]) & 127) + $u);
          $s[] = floor(n($t[0], $u) / 256);
          break;
      };
      fread($m, $r);
    };
    fclose($m);
    unlink("/var/www/{$k}.wav");
    $x = imagecreatetruecolor(sizeof($s) / d, $c);
    imagealphablending($x, false);
    imagesavealpha($x, true);
    $y = imagecolorallocatealpha($x, 255, 255, 255, 127);
    imagefilledrectangle($x, 0, 0, sizeof($s) / d, $c, $y);
    for ($d = 0; $d < sizeof($s); $d += d) {
      $v = (int)($s[$d] / 255 * $c);
      imageline($x, $d/d, 0 + ($c - $v), $d/d, $c - ($c - $v), imagecolorallocate($x, 255, 0, 255));
    };
    $z = imagecreatetruecolor($b, $c);
    imagealphablending($z, false);
    imagesavealpha($z, true);
    imagefilledrectangle($z, 0, 0, $b, $c, $y);
    imagecopyresampled($z, $x, 0, 0, 0, 0, $b, $c, sizeof($s) / d, $c);
    imagepng($z, realpath($a).".png");
    header("Content-Type: image/png");
    imagepng($z);
    imagedestroy($z);
  };
} else {
  echo $a;
};
?>

The script works... but you are limited to a max image size of 4k pixels, so you don't get a nice waveform if it should represent only some milliseconds.

What do I need to store/create a realtime waveform like the Traktor app or this PHP script? By the way, Traktor also has a colored waveform (the PHP script does not).

EDIT: I rewrote your script so that it fits my idea... it's relatively fast. As you can see, inside the function createArray I push the various lines into an object with the key as the x coordinate. I'm simply taking the highest number. Here is where we could play with the colors.

var ajaxB, AC, B, LC, op, x, y, ARRAY = {}, W = 1024, H = 256;
var aMax = Math.max.apply.bind(Math.max, Math);

function error(a) {
  console.log(a);
};

function createDrawing() {
  console.log('drawingArray');
  var C = document.createElement('canvas');
  C.width = W;
  C.height = H;
  document.body.appendChild(C);
  var context = C.getContext('2d');
  context.save();
  context.strokeStyle = '#121';
  context.globalCompositeOperation = 'lighter';
  L2 = W * 1;
  while (L2--) {
    context.beginPath();
    context.moveTo(L2, 0);
    context.lineTo(L2 + 1, ARRAY[L2]);
    context.stroke();
  }
  context.restore();
};

function createArray(a) {
  console.log('creatingArray');
  B = a;
  LC = B.getChannelData(0); // Float32Array describing left channel
  L = LC.length;
  op = W / L;
  for (var i = 0; i < L; i++) {
    x = W * i / L | 0;
    y = LC[i] * H / 2;
    if (ARRAY[x]) {
      ARRAY[x].push(y)
    } else {
      !ARRAY[x - 1] || (ARRAY[x - 1] = aMax(ARRAY[x - 1]));
      // the above line contains an array of values
      // which could be converted to a color
      // or just simply create a gradient
      // based on avg max min (frequency???) whatever
      ARRAY[x] = [y]
    }
  };
  createDrawing();
};

function decode() {
  console.log('decodingMusic');
  AC = new webkitAudioContext
  AC.decodeAudioData(this.response, createArray, error);
};

function loadMusic(url) {
  console.log('loadingMusic');
  ajaxB = new XMLHttpRequest;
  ajaxB.open('GET', url);
  ajaxB.responseType = 'arraybuffer';
  ajaxB.onload = decode;
  ajaxB.send();
}

loadMusic('AudioOrVideo.mp4');
Ok, so what I would do is load the sound with an XMLHttpRequest, then decode it using Web Audio, then display it 'carefully' to have the colors you are searching for. I just made a quick version, copy-pasting from various of my projects; it is quite working, as you might see with this picture:

The issue is that it is slow as hell. To get (more) decent speed, you'll have to do some computation to reduce the number of lines to draw on the canvas, because at 44100 Hz you very quickly get too many lines to draw.

// AUDIO CONTEXT
window.AudioContext = window.AudioContext || window.webkitAudioContext;
if (!AudioContext) alert('This site cannot be run in your Browser. Try a recent Chrome or Firefox. ');
var audioContext = new AudioContext();
var currentBuffer = null;

// CANVAS
var canvasWidth = 512, canvasHeight = 120;
var newCanvas = createCanvas(canvasWidth, canvasHeight);
var context = null;

window.onload = appendCanvas;
function appendCanvas() {
  document.body.appendChild(newCanvas);
  context = newCanvas.getContext('2d');
}

// MUSIC LOADER + DECODE
function loadMusic(url) {
  var req = new XMLHttpRequest();
  req.open("GET", url, true);
  req.responseType = "arraybuffer";
  req.onreadystatechange = function (e) {
    if (req.readyState == 4) {
      if (req.status == 200)
        audioContext.decodeAudioData(req.response,
          function(buffer) {
            currentBuffer = buffer;
            displayBuffer(buffer);
          }, onDecodeError);
      else
        alert('error during the load. Wrong url or cross origin issue');
    }
  };
  req.send();
}

function onDecodeError() { alert('error while decoding your file.'); }

// MUSIC DISPLAY
function displayBuffer(buff /* is an AudioBuffer */) {
  var leftChannel = buff.getChannelData(0); // Float32Array describing left channel
  var lineOpacity = canvasWidth / leftChannel.length;
  context.save();
  context.fillStyle = '#222';
  context.fillRect(0, 0, canvasWidth, canvasHeight);
  context.strokeStyle = '#121';
  context.globalCompositeOperation = 'lighter';
  context.translate(0, canvasHeight / 2);
  context.globalAlpha = 0.06; // lineOpacity ;
  for (var i = 0; i < leftChannel.length; i++) {
    // on which line do we get ?
    var x = Math.floor(canvasWidth * i / leftChannel.length);
    var y = leftChannel[i] * canvasHeight / 2;
    context.beginPath();
    context.moveTo(x, 0);
    context.lineTo(x + 1, y);
    context.stroke();
  }
  context.restore();
  console.log('done');
}

function createCanvas(w, h) {
  var newCanvas = document.createElement('canvas');
  newCanvas.width = w;
  newCanvas.height = h;
  return newCanvas;
};

loadMusic('could_be_better.mp3');

Edit: The issue here is that we have too much data to draw. Take a 3-minute mp3: you'll have 3*60*44100 = about 8,000,000 lines to draw. On a display that has, say, 1024 px resolution, that makes 8,000 lines per pixel...

In the code above, the canvas is doing the 'resampling', by drawing lines with low opacity and in 'lighter' composition mode (e.g. pixels' r,g,b will add up). To speed things up, you have to resample by yourself; but to get some colors, it's not just a down-sampling: you'll have to handle a set (within a performance array most probably) of 'buckets', one for each horizontal pixel (so, say, 1024), and in every bucket you compute the cumulated sound pressure, the variance, min, max, and then, at display time, you decide how you will render that with colors. For instance:

values between 0 and positiveMin are very clear (any sample is below that point),
values between positiveMin and positiveAverage - variance are darker,
values between positiveAverage - variance and positiveAverage + variance are darker,
and values between positiveAverage + variance and positiveMax lighter
(same for negative values).

That makes 5 colors for each bucket, and it's still quite some work, for you to code and for the browser to compute.
I don't know if the performance can get decent with this, but I fear that the statistical accuracy and the color coding of the software you mention can't be reached in a browser (obviously not in real time), and that you'll have to make some compromises.

Edit 2: I tried to get some colors out of the stats but it quite failed. My guess, now, is that the guys at Traktor also change color depending on frequency... quite some work here...

Anyway, just for the record, the code for an average / mean variation follows. (The variance was too low; I had to use mean variation.)

// MUSIC DISPLAY
function displayBuffer2(buff /* is an AudioBuffer */) {
  var leftChannel = buff.getChannelData(0); // Float32Array describing left channel
  // we 'resample' with cumul, count, variance
  // Offset 0 : PositiveCumul  1: PositiveCount  2: PositiveVariance
  //        3 : NegativeCumul  4: NegativeCount  5: NegativeVariance
  // that makes 6 data per bucket
  var resampled = new Float64Array(canvasWidth * 6);
  var i = 0, j = 0, buckIndex = 0;
  var min = 1e3, max = -1e3;
  var thisValue = 0, res = 0;
  var sampleCount = leftChannel.length;
  // first pass for mean
  for (i = 0; i < sampleCount; i++) {
    // in which bucket do we fall ?
    buckIndex = 0 | (canvasWidth * i / sampleCount);
    buckIndex *= 6;
    // positive or negative ?
    thisValue = leftChannel[i];
    if (thisValue > 0) {
      resampled[buckIndex] += thisValue;
      resampled[buckIndex + 1] += 1;
    } else if (thisValue < 0) {
      resampled[buckIndex + 3] += thisValue;
      resampled[buckIndex + 4] += 1;
    }
    if (thisValue < min) min = thisValue;
    if (thisValue > max) max = thisValue;
  }
  // compute mean now
  for (i = 0, j = 0; i < canvasWidth; i++, j += 6) {
    if (resampled[j + 1] != 0) {
      resampled[j] /= resampled[j + 1];
    }
    if (resampled[j + 4] != 0) {
      resampled[j + 3] /= resampled[j + 4];
    }
  }
  // second pass for mean variation (variance is too low)
  for (i = 0; i < leftChannel.length; i++) {
    // in which bucket do we fall ?
    buckIndex = 0 | (canvasWidth * i / leftChannel.length);
    buckIndex *= 6;
    // positive or negative ?
    thisValue = leftChannel[i];
    if (thisValue > 0) {
      resampled[buckIndex + 2] += Math.abs(resampled[buckIndex] - thisValue);
    } else if (thisValue < 0) {
      resampled[buckIndex + 5] += Math.abs(resampled[buckIndex + 3] - thisValue);
    }
  }
  // compute mean variation/variance now
  for (i = 0, j = 0; i < canvasWidth; i++, j += 6) {
    if (resampled[j + 1]) resampled[j + 2] /= resampled[j + 1];
    if (resampled[j + 4]) resampled[j + 5] /= resampled[j + 4];
  }
  context.save();
  context.fillStyle = '#000';
  context.fillRect(0, 0, canvasWidth, canvasHeight);
  context.translate(0.5, canvasHeight / 2);
  context.scale(1, 200);
  for (var i = 0; i < canvasWidth; i++) {
    j = i * 6;
    // draw from positiveAvg - variance to negativeAvg - variance
    context.strokeStyle = '#F00';
    context.beginPath();
    context.moveTo(i, (resampled[j] - resampled[j + 2]));
    context.lineTo(i, (resampled[j + 3] + resampled[j + 5]));
    context.stroke();
    // draw from positiveAvg - variance to positiveAvg + variance
    context.strokeStyle = '#FFF';
    context.beginPath();
    context.moveTo(i, (resampled[j] - resampled[j + 2]));
    context.lineTo(i, (resampled[j] + resampled[j + 2]));
    context.stroke();
    // draw from negativeAvg + variance to negativeAvg - variance
    // context.strokeStyle = '#FFF';
    context.beginPath();
    context.moveTo(i, (resampled[j + 3] + resampled[j + 5]));
    context.lineTo(i, (resampled[j + 3] - resampled[j + 5]));
    context.stroke();
  }
  context.restore();
  console.log('done 231 iyi');
}
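A simpler middle ground between drawing a line per sample and the 6-value statistics above is a plain min/max reduction: one bucket per horizontal pixel, keeping only the extremes, then drawing one vertical line from min to max per pixel. A sketch (the function name is mine; it assumes the sample count is at least the pixel width):

```javascript
// Reduce `samples` to `width` [min, max] pairs - one per horizontal pixel -
// so the canvas only ever has to draw `width` vertical lines.
function minMaxBuckets(samples, width) {
  const out = new Float32Array(width * 2);
  const perBucket = samples.length / width;
  for (let x = 0; x < width; x++) {
    const start = Math.floor(x * perBucket);
    const end = Math.min(samples.length, Math.floor((x + 1) * perBucket));
    let min = Infinity, max = -Infinity;
    for (let i = start; i < end; i++) {
      if (samples[i] < min) min = samples[i];
      if (samples[i] > max) max = samples[i];
    }
    out[x * 2] = min;     // bucket minimum
    out[x * 2 + 1] = max; // bucket maximum
  }
  return out;
}
```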
Based on the top answer, I have controlled the cost by reducing the number of lines to draw and by moving some canvas calls out of the loop. See the following code for your reference.

// AUDIO CONTEXT
window.AudioContext = (window.AudioContext ||
                       window.webkitAudioContext ||
                       window.mozAudioContext ||
                       window.oAudioContext ||
                       window.msAudioContext);
if (!AudioContext) alert('This site cannot be run in your Browser. Try a recent Chrome or Firefox. ');
var audioContext = new AudioContext();
var currentBuffer = null;

// CANVAS
var canvasWidth = window.innerWidth, canvasHeight = 120;
var newCanvas = createCanvas(canvasWidth, canvasHeight);
var context = null;

window.onload = appendCanvas;
function appendCanvas() {
  document.body.appendChild(newCanvas);
  context = newCanvas.getContext('2d');
}

// MUSIC LOADER + DECODE
function loadMusic(url) {
  var req = new XMLHttpRequest();
  req.open("GET", url, true);
  req.responseType = "arraybuffer";
  req.onreadystatechange = function (e) {
    if (req.readyState == 4) {
      if (req.status == 200)
        audioContext.decodeAudioData(req.response,
          function(buffer) {
            currentBuffer = buffer;
            displayBuffer(buffer);
          }, onDecodeError);
      else
        alert('error during the load. Wrong url or cross origin issue');
    }
  };
  req.send();
}

function onDecodeError() { alert('error while decoding your file.'); }

// MUSIC DISPLAY
function displayBuffer(buff /* is an AudioBuffer */) {
  var drawLines = 500;
  var leftChannel = buff.getChannelData(0); // Float32Array describing left channel
  var lineOpacity = canvasWidth / leftChannel.length;
  context.save();
  context.fillStyle = '#080808';
  context.fillRect(0, 0, canvasWidth, canvasHeight);
  context.strokeStyle = '#46a0ba';
  context.globalCompositeOperation = 'lighter';
  context.translate(0, canvasHeight / 2);
  //context.globalAlpha = 0.6; // lineOpacity ;
  context.lineWidth = 1;
  var totallength = leftChannel.length;
  var eachBlock = Math.floor(totallength / drawLines);
  var lineGap = (canvasWidth / drawLines);
  context.beginPath();
  for (var i = 0; i <= drawLines; i++) {
    var audioBuffKey = Math.floor(eachBlock * i);
    var x = i * lineGap;
    var y = leftChannel[audioBuffKey] * canvasHeight / 2;
    context.moveTo(x, y);
    context.lineTo(x, (y * -1));
  }
  context.stroke();
  context.restore();
}

function createCanvas(w, h) {
  var newCanvas = document.createElement('canvas');
  newCanvas.width = w;
  newCanvas.height = h;
  return newCanvas;
};

loadMusic('could_be_better.mp3');
This is a bit old, sorry to bump, but it's the only post about displaying a full waveform with the Web Audio API and I'd like to share the method I used.

This method is not perfect but it only goes through the displayed audio and it only goes over it once. It also succeeds in displaying an actual waveform for short files or a big zoom, and a convincing loudness chart for bigger files zoomed out. Here is what it's like at middle zoom, kind of pleasant too.

Notice that both zooms use the same algorithm. I still struggle with scales (the zoomed waveform is bigger than the zoomed-out one, though not as much bigger as displayed on the images).

This algorithm I find is quite efficient (I can change the zoom on 4 min of music and it redraws flawlessly every 0.1 s).

function drawWaveform (audioBuffer, canvas, pos = 0.5, zoom = 1) {
  const canvasCtx = canvas.getContext("2d")
  const width = canvas.clientWidth
  const height = canvas.clientHeight
  canvasCtx.clearRect(0, 0, width, height)
  canvasCtx.fillStyle = "rgb(255, 0, 0)"

  // calculate displayed part of audio
  // and slice audio buffer to only process that part
  const bufferLength = audioBuffer.length
  const zoomLength = bufferLength / zoom
  const start = Math.max(0, bufferLength * pos - zoomLength / 2)
  const end = Math.min(bufferLength, start + zoomLength)
  const rawAudioData = audioBuffer.getChannelData(0).slice(start, end)

  // process chunks corresponding to 1 pixel width
  const chunkSize = Math.max(1, Math.floor(rawAudioData.length / width))
  for (let x = 0; x < width; x++) {
    const start = x * chunkSize
    const end = start + chunkSize
    const chunk = rawAudioData.slice(start, end)
    // calculate the total positive and negative area
    let positive = 0
    let negative = 0
    chunk.forEach(val =>
      val > 0 && (positive += val) || val < 0 && (negative += val)
    )
    // make it mean (this part makes zoomed-out audio smaller, needs improvement)
    negative /= chunk.length
    positive /= chunk.length
    // calculate amplitude of the wave
    const chunkAmp = -(negative - positive)
    // draw the bar corresponding to this pixel
    canvasCtx.fillRect(
      x,
      height / 2 - positive * height,
      1,
      Math.max(1, chunkAmp * height)
    )
  }
}

To use it:

async function decodeAndDisplayAudio (audioData) {
  const source = audioCtx.createBufferSource()
  source.buffer = await audioCtx.decodeAudioData(audioData)
  drawWaveform(source.buffer, canvas, 0.5, 1)
  // change position (0//start -> 0.5//middle -> 1//end)
  // and zoom (0.5//full -> 400//zoomed) as you wish
}

// audioData comes raw from the file (the server sends it in my case)
decodeAndDisplayAudio(audioData)
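The per-pixel reduction inside drawWaveform can also be separated from the canvas work, which makes the averaging logic testable on its own (the function name and return shape are mine, not from the answer):

```javascript
// Pure version of drawWaveform's per-pixel reduction: mean of the positive
// samples and mean of the negative samples per chunk, returned as bar
// geometry in [0, 1] units (before scaling by the canvas height).
function chunkAmplitudes(samples, width) {
  const chunkSize = Math.max(1, Math.floor(samples.length / width));
  const bars = [];
  for (let x = 0; x < width; x++) {
    const chunk = samples.slice(x * chunkSize, (x + 1) * chunkSize);
    let positive = 0, negative = 0;
    for (const v of chunk) {
      if (v > 0) positive += v;
      else if (v < 0) negative += v;
    }
    positive /= chunk.length;
    negative /= chunk.length;
    // top of the bar and its total height, as fractions of full scale
    bars.push({ top: positive, amp: positive - negative });
  }
  return bars;
}
```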