I'm trying to create a 1ch (mono) MediaStreamTrack with a MediaStreamAudioDestinationNode. According to the standard, this should be possible.
const ctx = new AudioContext();
const destinationNode = new MediaStreamAudioDestinationNode(ctx, {
channelCount: 1,
channelCountMode: 'explicit',
channelInterpretation: 'speakers',
});
await ctx.resume(); // doesn't make a difference
// this fails
expect(destinationNode.stream.getAudioTracks()[0].getSettings().channelCount).equal(1);
Result:
Chrome 92.0.4515.107 always creates a stereo track
Firefox 90 returns an empty object from destinationNode.stream.getAudioTracks()[0].getSettings(), even though getSettings() should be fully supported
What am I doing wrong here?
Edit:
Apparently both Firefox and Chrome actually produce a mono track; they just don't report it correctly. Here's a workaround in TypeScript:
async function getNumChannelsInTrack(track: MediaStreamTrack): Promise<number> {
  // Unfortunately, we can't use track.getSettings().channelCount, because
  // - Firefox 90 returns {} from getSettings() => see: https://bugzilla.mozilla.org/show_bug.cgi?id=1307808
  // - Chrome 92 always reports 2 channels, even if that's incorrect => see: https://bugs.chromium.org/p/chromium/issues/detail?id=1044645
  // Workaround: Record audio and look at the recorded buffer to determine the number of audio channels in the buffer.
  const stream = new MediaStream();
  stream.addTrack(track);
  const mediaRecorder = new MediaRecorder(stream);
  mediaRecorder.start();
  return new Promise<number>((resolve) => {
    // attach the handler before stopping, so the dataavailable event can't be missed
    mediaRecorder.ondataavailable = async ({ data }) => {
      const offlineAudioContext = new OfflineAudioContext({
        length: 1,
        sampleRate: 48000,
      });
      const audioBuffer = await offlineAudioContext.decodeAudioData(
        await data.arrayBuffer()
      );
      resolve(audioBuffer.numberOfChannels);
    };
    // record for one second, then stop and inspect the result
    setTimeout(() => mediaRecorder.stop(), 1000);
  });
}
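For completeness, the helper can be used like this; a minimal sketch (the await assumes an async context, and the expected result is 1):
const track = destinationNode.stream.getAudioTracks()[0];
const numChannels = await getNumChannelsInTrack(track);
console.log(numChannels); // logs 1, despite what getSettings() claims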
I don't think you're doing anything wrong. It's a known issue (https://bugs.chromium.org/p/chromium/issues/detail?id=1044645) in Chrome which simply hasn't been fixed yet.
In Firefox it doesn't seem to be implemented at all. This bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1307808) indicates that getSettings() currently only returns those values that can be changed.
I think it would be helpful if you star/follow these issues or comment on them to make sure they don't get forgotten about.
I've been trying out the Web Serial API in Chrome (https://web.dev/serial/) to do some basic communication with an Arduino board. However, I've noticed quite a substantial delay when reading data from the serial port. This same issue is present in some demos, but not all.
For instance, the WebSerial demo linked towards the bottom of that article reads back near instantaneously, while the Serial Terminal example shows a noticeable read delay (note that the write is triggered the moment a character is entered on the keyboard).
Since WebSerial is open source, I can check for differences against my own implementation; however, I am seeing performance much like the second example.
As for the relevant code:
this.port = await navigator.serial.requestPort({ filters });
await this.port.open({ baudRate: 115200, bufferSize: 255, dataBits: 8, flowControl: 'none', parity: 'none', stopBits: 1 });
this.open = true;
this.monitor();
private monitor = async () => {
  const dataEndFlag = new Uint8Array([4, 3]);
  while (this.open && this.port?.readable) {
    const reader = this.port.readable.getReader();
    try {
      let data: Uint8Array = new Uint8Array([]);
      while (this.open) {
        const { value, done } = await reader.read();
        if (done) {
          this.open = false;
          break;
        }
        if (value) {
          // append the incoming chunk to the accumulated buffer
          data = Uint8Array.of(...data, ...value);
        }
        // a message is complete once the buffer ends with the end flag (EOT, ETX)
        if (data.slice(-2).every((val, idx) => val === dataEndFlag[idx])) {
          const decoded = this.decoder.decode(data);
          this.messages.push(decoded);
          data = new Uint8Array([]);
        }
      }
    } catch {
      // reading failed, e.g. the device was disconnected; retry via the outer loop
    } finally {
      // release the lock so the stream can be read again (or closed)
      reader.releaseLock();
    }
  }
}
public write = async (data: string) => {
if (this.port?.writable) {
const writer = this.port.writable.getWriter();
await writer.write(this.encoder.encode(data));
writer.releaseLock();
}
}
The equivalent WebSerial code can be found here; my code is pretty much an exact replica. From what I can observe, it seems to hang at await reader.read() for a brief period of time.
This is occurring both on a Windows 10 device and a macOS Monterey device. The specific hardware device is an Arduino Pro Micro connected to a USB port.
Has anyone experienced this same scenario?
Update: I did some additional testing with more verbose logging. It seems that the time between the write and read is exactly 1 second every time.
The delay may result from serialEvent() in your Arduino sketch: set Serial.setTimeout(1);
This means a 1 millisecond timeout instead of the default 1000 milliseconds, which matches the exact 1-second delay you're observing.
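To verify the fix from the web side, you could time the gap between a write and the moment monitor() pushes the device's reply. A rough sketch against the class above (measureLatency and the probe message are hypothetical additions; messages and write() are from the snippet):
// hypothetical method on the same class as monitor()/write() above
public measureLatency = async (probe: string): Promise<number> => {
  const before = this.messages.length;
  const start = performance.now();
  await this.write(probe);
  // poll until monitor() pushes the decoded reply
  while (this.messages.length === before) {
    await new Promise((resolve) => setTimeout(resolve, 1));
  }
  return performance.now() - start; // ~1000 ms before the fix, far less after
};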
I have been trying to set up webcam recording in Google Colab. By adapting code found here: Is there any way to capture live video using webcam in google colab?, I was able to get some video; however, if I try to record for more than 60 seconds, the environment crashes.
I've tried a few rough workarounds:
Recording numerous smaller videos and concatenating them. This led to gaps in the footage.
Calling the 'wait' function a few times... 30 seconds, then another 30 seconds, etc. This still led to crashing with anything over 60 seconds.
This work has led me to believe that the problem isn't a timeout; it's caused by the incoming video stream exceeding the video buffer size.
I'm new to JavaScript and realize this isn't really what Colab is made for, but I would appreciate any help, even if it's just letting me know that this isn't a feasible thing to do! My code can be found below; thanks in advance!
# Run the function, get the video path as saved in your notebook, and play it back here.
from IPython.display import HTML, display, Javascript
from base64 import b64encode, b64decode
from google.colab.output import eval_js

def record_video_timed(filename='video.mp4'):
    js = Javascript("""
        async function wait(ms) {
          return new Promise(resolve => {
            setTimeout(resolve, ms);
          });
        }

        async function recordVideo() {
          // mashes together the advanced_outputs.ipynb function provided by Colab,
          // a bunch of stuff from Stack Overflow, and some sample code from:
          // https://developer.mozilla.org/en-US/docs/Web/API/MediaStream_Recording_API
          const options = { mimeType: "video/webm; codecs=vp9" };
          const stream = await navigator.mediaDevices.getUserMedia({video: true});
          let recorder = new MediaRecorder(stream, options);
          recorder.start();
          await wait(10000); // less than 60 seconds, or it crashes
          recorder.stop();
          let recData = await new Promise((resolve) => recorder.ondataavailable = resolve);
          let arrBuff = await recData.data.arrayBuffer();
          // base64-encode the recorded bytes so they can cross back into Python
          let binaryString = "";
          let bytes = new Uint8Array(arrBuff);
          bytes.forEach((byte) => {
            binaryString += String.fromCharCode(byte);
          });
          return btoa(binaryString);
        }
    """)
    try:
        display(js)
        data = eval_js('recordVideo({})')
        binary = b64decode(data)
        with open(filename, "wb") as video_file:
            video_file.write(binary)
        print(
            f"Finished recording video. Saved binary under filename in current working directory: {filename}"
        )
    except Exception as err:
        # In case any exceptions arise
        print(str(err))
    return filename

#######
video_path = record_video_timed()
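One possible mitigation, if the crash really is the recording buffer filling up: MediaRecorder.start() accepts a timeslice argument, which makes ondataavailable fire periodically, so the recording can be collected in chunks instead of as one giant blob at stop(). Chunks of a single continuous recording concatenate without gaps. A minimal, untested sketch of just the recording part (wait() as defined above):
async function recordVideoChunked() {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm; codecs=vp9" });
  const chunks = [];
  // flush the internal buffer into a chunk every 5 seconds
  recorder.ondataavailable = (event) => chunks.push(event.data);
  recorder.start(5000);
  await wait(120000); // record for 2 minutes
  recorder.stop();
  // wait for the final chunk before assembling the result
  await new Promise((resolve) => (recorder.onstop = resolve));
  // the Blob can then be base64-encoded and handed back to Python as above
  return new Blob(chunks, { type: "video/webm" });
}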
I'd like to be able to end a Google speech-to-text stream (created with streamingRecognize), and get back the pending SR (speech recognition) results.
In a nutshell, the relevant Node.js code:
// create SR stream
const stream = speechClient.streamingRecognize(request);
// observe data event
const dataPromise = new Promise(resolve => stream.on('data', resolve));
// observe error event
const errorPromise = new Promise((resolve, reject) => stream.on('error', reject));
// observe finish event
const finishPromise = new Promise(resolve => stream.on('finish', resolve));
// send the audio
stream.write(audioChunk);
// for testing purposes only, give the SR stream 2 seconds to absorb the audio
await new Promise(resolve => setTimeout(resolve, 2000));
// end the SR stream gracefully, by observing the completion callback
const endPromise = util.promisify(callback => stream.end(callback))();
// a 5-second test timeout
const timeoutPromise = new Promise(resolve => setTimeout(resolve, 5000));
// finishPromise wins the race here
await Promise.race([
dataPromise, errorPromise, finishPromise, endPromise, timeoutPromise]);
// endPromise wins the race here
await Promise.race([
dataPromise, errorPromise, endPromise, timeoutPromise]);
// timeoutPromise wins the race here
await Promise.race([dataPromise, errorPromise, timeoutPromise]);
// I don't see any data or error events, dataPromise and errorPromise don't get settled
What I experience is that the SR stream ends successfully, but I don't get any data events or error events. Neither dataPromise nor errorPromise gets resolved or rejected.
How can I signal the end of my audio, close the SR stream and still get the pending SR results?
I need to stick with the streamingRecognize API because the audio I'm streaming is real-time, even though it may stop suddenly.
To clarify: it works as long as I keep streaming the audio; I do receive the real-time SR results. However, when I send the final audio chunk and end the stream like above, I don't get the final results I'd expect otherwise.
To get the final results, I actually have to keep streaming silence for several more seconds, which may increase the speech-to-text bill. I feel like there must be a better way to get them.
Updated: so it appears the only proper time to end a streamingRecognize stream is upon a data event where StreamingRecognitionResult.is_final is true. It also appears we're expected to keep streaming audio until the data event fires in order to get any result at all, final or interim.
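In code, that pattern would look roughly like this (a sketch, using the stream from the snippet above):
stream.on('data', data => {
  const result = data.results[0];
  if (result && result.isFinal) {
    // only now is it apparently safe to end the stream without losing results
    stream.end();
  }
});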
This looks like a bug to me, filing an issue.
Updated: it now seems to have been confirmed as a bug. Until it's fixed, I'm looking for a potential workaround.
Updated: for future references, here is the list of the current and previously tracked issues involving streamingRecognize.
I'd expect this to be a common problem for those who use streamingRecognize; I'm surprised it hasn't been reported before. I'm submitting it as a bug to issuetracker.google.com as well.
My bad: unsurprisingly, this turned out to be an obscure race condition in my code.
I've put together a self-contained sample that works as expected (gist). It helped me track down the issue. Hopefully it may help others and my future self:
// A simple streamingRecognize workflow,
// tested with Node v15.0.1, by @noseratio
import fs from 'fs';
import path from "path";
import url from 'url';
import util from "util";
import timers from 'timers/promises';
import speech from '@google-cloud/speech';
export {}
// needs a 16-bit, 16 kHz raw PCM audio file
const filename = path.join(path.dirname(url.fileURLToPath(import.meta.url)), "sample.raw");
const encoding = 'LINEAR16';
const sampleRateHertz = 16000;
const languageCode = 'en-US';
const request = {
config: {
encoding: encoding,
sampleRateHertz: sampleRateHertz,
languageCode: languageCode,
},
interimResults: false // If you want interim results, set this to true
};
// init SpeechClient
const client = new speech.v1p1beta1.SpeechClient();
await client.initialize();
// Stream the audio to the Google Cloud Speech API
const stream = client.streamingRecognize(request);
// log all data
stream.on('data', data => {
const result = data.results[0];
console.log(`SR results, final: ${result.isFinal}, text: ${result.alternatives[0].transcript}`);
});
// log all errors
stream.on('error', error => {
console.warn(`SR error: ${error.message}`);
});
// observe data event
const dataPromise = new Promise(resolve => stream.once('data', resolve));
// observe error event
const errorPromise = new Promise((resolve, reject) => stream.once('error', reject));
// observe finish event
const finishPromise = new Promise(resolve => stream.once('finish', resolve));
// observe close event
const closePromise = new Promise(resolve => stream.once('close', resolve));
// we could just pipe it:
// fs.createReadStream(filename).pipe(stream);
// but we want to simulate the web socket data
// read RAW audio as Buffer
const data = await fs.promises.readFile(filename, null);
// simulate multiple audio chunks
console.log("Writing...");
const chunkSize = 4096;
for (let i = 0; i < data.length; i += chunkSize) {
stream.write(data.slice(i, i + chunkSize));
await timers.setTimeout(50);
}
console.log("Done writing.");
console.log("Before ending...");
await util.promisify(c => stream.end(c))();
console.log("After ending.");
// race for events
await Promise.race([
errorPromise.catch(() => console.log("error")),
dataPromise.then(() => console.log("data")),
closePromise.then(() => console.log("close")),
finishPromise.then(() => console.log("finish"))
]);
console.log("Destroying...");
stream.destroy();
console.log("Final timeout...");
await timers.setTimeout(1000);
console.log("Exiting.");
The output:
Writing...
Done writing.
Before ending...
SR results, final: true, text: this is a test I'm testing voice recognition This Is the End
After ending.
data
finish
Destroying...
Final timeout...
close
Exiting.
To test it, a 16-bit/16 kHz raw PCM audio file is required. An arbitrary WAV file wouldn't work as-is, because it contains a header with metadata.
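If all you have is a simple PCM WAV recording, one way to get a raw file is to strip the header. A sketch that assumes a canonical 44-byte header and placeholder file names (real WAV files may contain extra chunks, so this is not a general parser):
import fs from 'fs';

// 'sample.wav' and 'sample.raw' are placeholder names
const wav = await fs.promises.readFile('sample.wav');
await fs.promises.writeFile('sample.raw', wav.subarray(44));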
This: "I'm looking for a potential workaround." - have you considered extending SpeechClient as a base class? I don't have credentials to test, but you can extend SpeechClient with your own class and then call the internal close() method as needed. The close() method shuts down the SpeechClient and resolves the outstanding Promise.
Alternatively you could also Proxy the SpeechClient() and intercept/respond as needed. But since your intent is to shut it down, the below option might be your workaround.
const speech = require('@google-cloud/speech');
class ClientProxy extends speech.SpeechClient {
constructor() {
super();
}
myCustomFunction() {
this.close();
}
}
const clientProxy = new ClientProxy();
try {
clientProxy.myCustomFunction();
} catch (err) {
console.log("myCustomFunction generated error: ", err);
}
Since it's a bug, I don't know if this is suitable for you, but I have used this.recognizeStream.end() several times in different situations, and it worked. However, my code was a bit different...
This thread may be something for you:
https://groups.google.com/g/cloud-speech-discuss/c/lPaTGmEcZQk/m/Kl4fbHK2BQAJ
I'm working on a project that utilizes WebRTC for file transfers. Recently someone reported an issue saying that transfers end prematurely for bigger files. I've found the problem, and my solution was to rely on the bufferedamountlow event to coordinate the sending of chunks. I've also stopped closing the connection when the sender thinks it's complete.
For some reason, though, in Safari that event does not fire.
Here is the relevant code:
const connection = new RTCPeerConnection(rtcConfiguration);
const channel = connection.createDataChannel('sendDataChannel');
channel.binaryType = 'arraybuffer';
channel.addEventListener('open', () => {
const fileReader = new FileReader();
let offset = 0;
const nextSlice = (currentOffset: number) => {
// Do asynchronous thing with FileReader, that will result in
// channel.send(buffer) getting called.
// Also, offset gets increased by 16384 (the size of the buffer).
};
channel.bufferedAmountLowThreshold = 0;
channel.addEventListener('bufferedamountlow', () => nextSlice(offset));
nextSlice(0);
});
The longer version of my code is available here.
While researching the issue, I've realized that on Safari, my connection.sctp is undefined. (I had switched to connection.sctp.maxMessageSize instead of 16384 for my buffer size.) I would assume the problem is related to that.
What could be the cause for this problem? Let me add that on Chrome and Firefox everything works just fine without any issues whatsoever.
The bufferedamountlow event is not required for the proper function of my code, I would like for it to work, though, to get more precise estimates of current progress and speed on the sending end of the file transfer.
After some investigation, it turns out that Safari has issues with 0 as a value for the bufferedAmountLowThreshold property.
When set to a non-zero value, the code functions properly.
Checking the bufferedAmount after each send also increases the speed at which the chunks are being sent:
const bufferSize = connection.sctp?.maxMessageSize || 65535;
channel.addEventListener('open', () => {
  const fileReader = new FileReader();
  let offset = 0;
  const nextSlice = (currentOffset: number) => {
    // read the next chunk of the file
    const slice = file.slice(currentOffset, currentOffset + bufferSize);
    fileReader.readAsArrayBuffer(slice);
  };
  fileReader.addEventListener('load', e => {
    const buffer = e.target.result as ArrayBuffer;
    try {
      channel.send(buffer);
    } catch {
      // Deal with failure...
    }
    offset += buffer.byteLength;
    // if the outgoing buffer has already drained below the threshold,
    // queue the next chunk right away instead of waiting for the event
    if (channel.bufferedAmount < bufferSize / 2) {
      nextSlice(offset);
    }
  });
  // Safari misbehaves with a threshold of 0, so use a non-zero value
  channel.bufferedAmountLowThreshold = bufferSize / 2;
  channel.addEventListener('bufferedamountlow', () => nextSlice(offset));
  nextSlice(0);
});
We are currently trying to use the native BarcodeDetector in the latest Chrome (59). It is available behind the chrome://flags/#enable-experimental-web-platform-features flag.
You can have look at another example here.
We are checking for the native BarcodeDetector like this:
typeof window.BarcodeDetector === 'function'
But even when we take this branch and finally manage to feed some image data into the detector, we only get an exception:
DOMException: Barcode detection service unavailable.
I've googled that, but was not very successful. The most promising hint is this one, but it seems to be an odd WebKit fork.
What we are doing is the following (pseudocode!):
window.createImageBitmap(canvasContext.canvas) // from canvasEl.getContext('2d')
.then(function(data)
{
window.BarcodeDetector.detect(data);
});
// go on: catch the exception
Has anybody ever heard of this and can share some experiences with the BarcodeDetector?
I've had the same error using the desktop version of Chrome. I guess at the moment it is only implemented in the mobile version.
Here is an example that worked for me (Android 5.0.2, Chrome 59 with chrome://flags/#enable-experimental-web-platform-features enabled): https://jsfiddle.net/daniilkovalev/341u3qxz/
navigator.mediaDevices.enumerateDevices().then((devices) => {
  let id = devices.filter((device) => device.kind === "videoinput").slice(-1).pop().deviceId;
  let constraints = {video: {optional: [{sourceId: id }]}};
  navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
    let capturer = new ImageCapture(stream.getVideoTracks()[0]);
    step(capturer);
  });
});
function step(capturer) {
capturer.grabFrame().then((bitmap) => {
let canvas = document.getElementById("canvas");
let ctx = canvas.getContext("2d");
ctx.drawImage(bitmap, 0, 0, bitmap.width, bitmap.height, 0, 0, canvas.width, canvas.height);
var barcodeDetector = new BarcodeDetector();
barcodeDetector.detect(bitmap)
.then(barcodes => {
document.getElementById("barcodes").innerHTML = barcodes.map(barcode => barcode.rawValue).join(', ');
step(capturer);
})
.catch((e) => {
console.error(e);
});
});
}
Reviving an old thread here.
It is not supported in desktop browsers, only mobile browsers.
Here's my working code:
getImage(event) {
  let file: File = event.target.files[0];
  // BarcodeDetector isn't in the TypeScript DOM typings yet, hence the lookup
  let barcode = window['BarcodeDetector'];
  let pdf = new barcode({ formats: ["pdf417"] });
  createImageBitmap(file)
    .then(img => pdf.detect(img))
    .then(resp => {
      alert(resp[0].rawValue);
    });
}
It took some back and forth, but we finally have reliable feature detection for this API; see the article for full details. This is the relevant code snippet:
console.log(await BarcodeDetector.getSupportedFormats());
/* On a macOS computer logs
[
"aztec",
"code_128",
"code_39",
"code_93",
"data_matrix",
"ean_13",
"ean_8",
"itf",
"pdf417",
"qr_code",
"upc_e"
]
*/
This allows you to detect the specific feature you need, for example, QR code scanning:
if (('BarcodeDetector' in window) &&
((await BarcodeDetector.getSupportedFormats()).includes('qr_code'))) {
console.log('QR code scanning is supported.');
}
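Once support is confirmed, scanning itself is straightforward; a minimal sketch, where imageEl stands in for some <img> or canvas element on your page:
const detector = new BarcodeDetector({ formats: ['qr_code'] });
const bitmap = await createImageBitmap(imageEl);
const barcodes = await detector.detect(bitmap);
console.log(barcodes.map(barcode => barcode.rawValue));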