I've built a demo of a voice assistant that takes microphone data, passes it to an analyser, then uses .getByteFrequencyData() to show visuals. It works as follows:
Press the mic button to connect to the microphone input.
Release the mic button to disconnect the microphone stream and play an MP3 response.
When the MP3 ends, return to standby and wait for a new button press to start again at step 1.
Live version here: https://dyadstudios.com/playground/daysi/
The way I've achieved this is as follows:
var audioContext = (window.AudioContext) ? new AudioContext() : new window["webkitAudioContext"]();
var analyser = audioContext.createAnalyser();
analyser.fftSize = Math.pow(2, 9); // 512
var sourceMic = undefined; // Microphone stream source
var sourceMp3 = undefined; // MP3 buffer source
// Browser requests mic access
window.navigator.mediaDevices.getUserMedia({audio: true}).then((stream) => {
    sourceMic = audioContext.createMediaStreamSource(stream);
});
// 1. Mic button pressed, start listening
listen() {
    audioContext.resume();
    // Connect mic to analyser
    if (sourceMic) {
        sourceMic.connect(analyser);
    }
}
// 2. Disconnect mic, play mp3
answer(mp3AudioBuffer) {
    if (sourceMic) {
        // Disconnect mic to prevent audio feedback
        sourceMic.disconnect();
    }
    // Play mp3
    sourceMp3 = audioContext.createBufferSource();
    sourceMp3.onended = mp3StreamEnded;
    sourceMp3.buffer = mp3AudioBuffer;
    sourceMp3.connect(analyser);
    sourceMp3.start(0);
    // Connect to speakers to hear MP3
    analyser.connect(audioContext.destination);
}
// 3. MP3 has ended
mp3StreamEnded() {
    sourceMp3.disconnect();
    // Disconnect speakers (prevents mic feedback)
    analyser.disconnect();
}
It works perfectly well on Firefox and Chrome, but OSX Safari 12.1 only gets microphone data the first time I press the button. Whenever I press the mic button on a second pass, the analyser no longer gets microphone data, although MP3 data still works. It seems like connecting, disconnecting, and re-connecting the mic's AudioNode to the analyser breaks it somehow. I checked, and Safari supports both AudioNode.connect() and AudioNode.disconnect(). I know Safari's Web Audio implementation is a bit outdated; is there a workaround to fix this issue?
There is indeed a bug in Safari which causes it to drop the signal if a MediaStreamAudioSourceNode is disconnected for some time. You can avoid this by just not disconnecting it as long as you might need it again. You can use a GainNode instead to mute the signal.
You could do this by introducing a new variable to control the volume.
const sourceMicVolume = audioContext.createGain();
sourceMicVolume.gain.value = 0;
Then you need to connect everything right away when you instantiate the sourceMic.
sourceMic = audioContext.createMediaStreamSource(stream);
sourceMic.connect(sourceMicVolume);
sourceMicVolume.connect(analyser);
Inside your event handlers you would then only set the volume of the gain instead of (dis)connecting the nodes. Inside the listen() function that would look like this:
if (sourceMic) {
    sourceMicVolume.gain.value = 1;
}
And inside the answer() function it would look like this:
if (sourceMic) {
    sourceMicVolume.gain.value = 0;
}
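Putting the pieces together, a minimal sketch of the whole flow with the mute approach could look like this (written with plain functions instead of the question's method shorthand; audioContext, analyser, sourceMic, sourceMp3 and mp3StreamEnded are assumed to be the same as in the question, and mp3StreamEnded stays unchanged):
const sourceMicVolume = audioContext.createGain();
sourceMicVolume.gain.value = 0; // muted while in standby

window.navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    sourceMic = audioContext.createMediaStreamSource(stream);
    // Connect once and never disconnect; only the gain value changes later
    sourceMic.connect(sourceMicVolume);
    sourceMicVolume.connect(analyser);
});

// 1. Mic button pressed, start listening
function listen() {
    audioContext.resume();
    if (sourceMic) {
        sourceMicVolume.gain.value = 1; // unmute instead of connecting
    }
}

// 2. Mute mic, play mp3
function answer(mp3AudioBuffer) {
    if (sourceMic) {
        sourceMicVolume.gain.value = 0; // mute instead of disconnecting
    }
    sourceMp3 = audioContext.createBufferSource();
    sourceMp3.onended = mp3StreamEnded;
    sourceMp3.buffer = mp3AudioBuffer;
    sourceMp3.connect(analyser);
    sourceMp3.start(0);
    analyser.connect(audioContext.destination);
}
Since the MediaStreamAudioSourceNode is never disconnected, Safari keeps delivering microphone data on every pass; the GainNode only decides whether that data is audible to the analyser path.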
Related
Our website records audio and plays it back for a user. It has worked for years with many different devices, but it started failing on the iPhone 14. I created a test app at https://nmp-recording-test.netlify.app/ so I can see what is going on. It works perfectly on all devices but it only works the first time on an iPhone 14. It works on other iPhones and it works on iPad and MacBooks using Safari or any other browser.
It looks like it will record if that is the first audio the page ever does. If I create an AudioContext somewhere else first, the audio playback will work for that, but then the recording won't work.
The only symptom I can see is that it doesn't call MediaRecorder.ondataavailable when it is not working, but I assume that is because it isn't recording.
Here is the pattern that I'm seeing with my test site:
Click "new recording". (the level indicator moves, the data available callback is triggered)
Click "listen" I hear what I just did
Click "new recording". (no levels move, no data is reported)
Click "listen" nothing is played.
But if I do anything, like click the metronome on and off then it won't record the FIRST time, either.
The "O.G. Recording" is the original way I was doing the recording, using deprecated method createMediaStreamSource() and createScriptProcessor()/createJavaScriptNode(). I thought maybe iPhone finally got rid of that, so I created the MediaRecorder version.
What I'm doing, basically, is (truncated to show the important part):
const chunks = [];

function onSuccess(stream: MediaStream) {
    mediaRecorder = new MediaRecorder(stream);
    mediaRecorder.ondataavailable = function (e) {
        chunks.push(e.data);
    };
    mediaRecorder.start(1000);
}

navigator.mediaDevices.getUserMedia({ audio: true }).then(onSuccess, onError);
Has anyone else seen anything different in the way the iPhone 14 handles recording?
Does anyone have a suggestion about how to debug this?
If you have an iPhone 14, would you try my test program above and let me know if you get the same results? We only have one iPhone 14 to test with, and maybe there is something weird about that device.
If it works you should see a number of lines something like data {"len":6784} appear every second when you are recording.
--- EDIT ---
I reworked the code along the lines of Frank zeng's suggestion and I am getting it to record, but it is still not right. The volume is really low, it looks like there are some dropouts, and there is a really long pause when resuming the AudioContext.
The new code seems to work perfectly in the other devices and browsers I have access to.
--- EDIT 2 ---
There were two problems: the first is that the deprecated createScriptProcessor stopped working, and the second was an iOS bug that was fixed in version 16.2. So rewriting to use an AudioWorklet was needed, but keeping the recording going once it is started is not.
I have the same problem as you. I think AudioContext.createScriptProcessor no longer works on the iPhone 14, so I used the newer AudioWorkletNode API to replace it. Also, don't close the stream, because the second recording session on the iPhone 14 is too laggy; just remember to destroy the data after recording. After testing, this solved the problem for me. Here's my code:
// get stream
window.navigator.mediaDevices.getUserMedia(options).then(async (stream) => {
    // that.stream = stream
    that.context = new AudioContext()
    await that.context.resume()
    const rate = that.context.sampleRate || 44100
    that.mp3Encoder = new lamejs.Mp3Encoder(1, rate, 128)
    that.mediaSource = that.context.createMediaStreamSource(stream)
    // createScriptProcessor is being phased out: keep using it if it is available,
    // otherwise fall back to the AudioWorklet approach to receive the audio data
    if (that.context.createScriptProcessor && typeof that.context.createScriptProcessor === 'function') {
        that.mediaProcessor = that.context.createScriptProcessor(0, 1, 1)
        that.mediaProcessor.onaudioprocess = event => {
            window.postMessage({ cmd: 'encode', buf: event.inputBuffer.getChannelData(0) }, '*')
            that._decode(event.inputBuffer.getChannelData(0))
        }
    } else { // use the new AudioWorklet approach
        that.mediaProcessor = await that.initWorklet()
    }
    resolve() // resolve from the enclosing Promise (not shown here)
})
// content of audioworklet function
async initWorklet() {
    try {
        /* audio stream analysis node */
        let audioWorkletNode;
        /* load the AudioWorkletProcessor module and add it to the current worklet */
        await this.context.audioWorklet.addModule('/get-voice-node.js');
        /* bind the AudioWorkletNode to the loaded AudioWorkletProcessor */
        audioWorkletNode = new AudioWorkletNode(this.context, "get-voice-node");
        /* the AudioWorkletNode and the AudioWorkletProcessor communicate via a MessagePort */
        console.log('audioWorkletNode', audioWorkletNode)
        const messagePort = audioWorkletNode.port;
        messagePort.onmessage = (e) => {
            let channelData = e.data[0];
            window.postMessage({ cmd: 'encode', buf: channelData }, '*')
            this._decode(channelData)
        }
        return audioWorkletNode;
    } catch (e) {
        console.log(e)
    }
}
// content of get-voice-node.js; remember to put it in the static resource directory
class GetVoiceNode extends AudioWorkletProcessor {
    /*
     * options are passed in when calling new AudioWorkletNode()
     */
    constructor() {
        super()
    }
    /*
     * inputList and outputList are arrays of inputs and outputs
     * the catch is that each block is only 128 samples??? how can that be configured
     */
    process(inputList, outputList, parameters) {
        // console.log(inputList)
        if (inputList.length > 0 && inputList[0].length > 0) {
            this.port.postMessage(inputList[0]);
        }
        return true // return true so the system knows we are still active and ready to process audio
    }
}
registerProcessor('get-voice-node', GetVoiceNode)
Destroy the recording instance and free the memory; if you want to use it again next time, you had better create a new instance:
this.recorder.stop()
this.audioDurationTimer && window.clearInterval(this.audioDurationTimer)
const audioBlob = this.recorder.getMp3Blob()
// Destroy the recording instance and free the memory
this.recorder = null
I'm using JavaScript's getUserMedia() to get access to the user's microphone and record the audio with recorder.js.
Everything is working fine except that the sound level is very low on mobile (tested in Safari and Chrome on iOS) but perfect on desktop (Chrome, FF, Safari).
I have tried to adjust the gain with gainNode = audioContext.createGain(); that affects the level from 0.0 (no sound) to 1.0 (normal sound) or higher (distorted sound). The problem is that 1.0 is perfect on desktop but very, very low on mobile. If I go to e.g. gain = 25, the volume is much higher but also very distorted and therefore not usable.
Is it possible to get a good, quality sound level on iOS, and how?
Here is my script so far:
var constraints = { audio: true, video: false };

navigator.mediaDevices.getUserMedia(constraints).then(function(stream) {
    // Get mic input
    audioContext = new AudioContext();
    gumStream = stream;
    input = audioContext.createMediaStreamSource(stream);

    // Set gain level
    gainNode = audioContext.createGain();
    gainNode.gain.value = 1.0;
    input.connect(gainNode);

    // Handle recording
    rec = new Recorder(gainNode);
    rec.record();

    // Audio visualizer
    analyser = audioContext.createAnalyser();
    freqs = new Uint8Array(analyser.frequencyBinCount);
    input.connect(analyser);
    requestAnimationFrame(visualize);
}).catch(function(err) {
});
UPDATE:
After a lot of research I found that the recording itself is fine. The problem is that when I play the recorded audio without refreshing the browser, the sound is very low. If I refresh the browser, the sound is perfect.
What is really weird is that if I put an alert() in my stopRecording() function, the sound plays perfectly even without refreshing the browser.
Tested in iOS Safari; could this be an iOS bug?
function stopRecording() {
    rec.stop();
    gumStream.getTracks().forEach(function(track) {
        if (track.readyState == 'live' && track.kind === 'audio') {
            track.stop();
        }
    });
    alert('Recording is complete');
    rec.exportWAV(handleRecording);
}
It's a bit like Safari doesn't release or end the getUserMedia() stream, and as long as that is 'on', audio.play() has low sound. Maybe the alert() changes browser focus and that is why it works(?)
I would really like to avoid having an alert there, but I don't know how this can be fixed.
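For reference, a rough sketch of what I'd like stopRecording() to look like without the alert, assuming the theory above is right and Safari just needs a moment after the tracks are stopped before exporting and playing back (the 250 ms delay is a guess, not a documented fix):
function stopRecording() {
    rec.stop();
    gumStream.getTracks().forEach(function(track) {
        if (track.readyState == 'live' && track.kind === 'audio') {
            track.stop();
        }
    });
    // Instead of alert(): yield back to the browser for a moment so the
    // mic stream can be released before exporting / playing back
    setTimeout(function() {
        rec.exportWAV(handleRecording);
    }, 250);
}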
I'm successfully running a client web page that acts as a voice message sender, using the MediaRecorder API:
when the user presses any key, an audio recording starts;
when the key is released, the audio recording is sent, via socket.io, to a server for further processing.
This is a sort of PTT (Push To Talk) user experience, where the user just has to press a key (push) to activate the voice recording, and afterwards release the key to stop recording, which triggers sending the message to the server.
Here's a JavaScript code chunk I used:
navigator.mediaDevices
    .getUserMedia({ audio: true })
    .then(stream => {
        const mediaRecorder = new MediaRecorder(stream)
        var audioChunks = []

        //
        // start and stop recording:
        // keyboard (any key) events
        //
        document
            .addEventListener('keydown', () => mediaRecorder.start())
        document
            .addEventListener('keyup', () => mediaRecorder.stop())

        //
        // add data chunk to mediarecorder
        //
        mediaRecorder
            .addEventListener('dataavailable', event => {
                audioChunks.push(event.data)
            })

        //
        // mediarecorder event stop
        // trigger socketio audio message emission.
        //
        mediaRecorder
            .addEventListener('stop', () => {
                socket.emit('audioMessage', audioChunks)
                audioChunks = []
            })
    })
Now, what I want is to activate/deactivate the audio (speech) recording not only from a web page button/key/touch, but from an external hardware microphone (with a Push-To-Talk button). More precisely, I want to interface an industrial headset with a PTT button on the ear dome; see the photo:
BTW, the PTT button is just a physical button that acts as a short-circuit toggle switch, as in the photo, just as an example:
By default the microphone is grounded and the input signal == 0
When the PTT button is pressed, the mic is activated and the input signal != 0.
Now my question is: how can I use the Web Audio API to detect when the PTT button is pressed (so the audio signal is > 0) in order to do a mediaRecorder.start()?
Reading here, I guess I have to use the stream returned by mediaDevices.getUserMedia and create an AudioContext() processor:
const handleSuccess = function(stream) {
    const context = new AudioContext();
    const source = context.createMediaStreamSource(stream);
    const processor = context.createScriptProcessor(1024, 1, 1);

    source.connect(processor);
    processor.connect(context.destination);

    processor.onaudioprocess = function(e) {
        // Do something with the data,
        console.log(e.inputBuffer);
    };
};

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
    .then(handleSuccess);
But what must the processor.onaudioprocess function do to start (volume > DELTA) and stop (volume < DELTA) the MediaRecorder?
I guess the volume detection could be useful in two situations:
With the PTT button, where the user explicitly decides the duration of the speech by pressing and releasing the button
Without the PTT button, in which case the voice message is created in so-called VOX mode (continuous audio processing)
Any ideas?
I'm answering my own question just to share a solution I found.
#cwilso's old volume-meter project seems to be precisely the implementation of what #scott-stensland suggested in the comment above. See the demo: https://webaudiodemos.appspot.com/volume-meter/
UPDATE
BTW, using #cwilso's project and #scott-stensland's suggestion, I implemented WeBAD, an open-source project that also solves my original question:
https://github.com/solyarisoftware/WeBAD
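To illustrate the idea (a minimal sketch of threshold-based start/stop, not the actual volume-meter code), the onaudioprocess callback from the question could compute the RMS volume of each block and compare it against DELTA; processor, mediaRecorder and DELTA are the names assumed from the question:
const DELTA = 0.02;    // empirical volume threshold, tune per microphone
let recording = false;

processor.onaudioprocess = function (e) {
    const samples = e.inputBuffer.getChannelData(0);

    // Compute the RMS volume of the current audio block
    let sum = 0;
    for (let i = 0; i < samples.length; i++) {
        sum += samples[i] * samples[i];
    }
    const rms = Math.sqrt(sum / samples.length);

    if (rms > DELTA && !recording) {
        recording = true;
        mediaRecorder.start();   // PTT pressed (or voice detected): start recording
    } else if (rms < DELTA && recording) {
        recording = false;
        mediaRecorder.stop();    // level dropped below threshold: stop and send
    }
};
In practice you would add some hang time (keep recording for a short while after the level drops) so that brief pauses in speech don't stop the recorder, especially in VOX mode.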
A simple usage of the Web Audio API:
var UnprefixedAudioContext = window.AudioContext || window.webkitAudioContext;
var context;
var volumeNode;
var soundBuffer;

context = new UnprefixedAudioContext();

volumeNode = context.createGain();
volumeNode.connect(context.destination);
volumeNode.gain.value = 1;

context.decodeAudioData(base64ToArrayBuffer(getTapWarm()), function (decodedAudioData) {
    soundBuffer = decodedAudioData;
});

function play(buffer) {
    var source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(volumeNode);
    (source.start || source.noteOn).call(source, 0);
};

function playClick() {
    play(soundBuffer);
}
Inside a UIWebView this works fine (it plays the sound); but if you switch to the Music app and play a song, then come back to the app with the UIWebView, the song stops playing.
The same code inside Safari doesn't have this problem.
Is there a workaround to avoid this behavior?
Here's the full fiddle:
http://jsfiddle.net/gabrielmaldi/4Lvdyhpx/
Are you on iOS? This sounds like an audio session category issue to me. iOS apps define how their audio interacts with other apps' audio. From Apple's documentation:
Each audio session category specifies a particular pattern of “yes” and “no” for each of the following behaviors, as detailed in Table B-1:
Interrupts non-mixable apps audio: If yes, non-mixable apps will be interrupted when your app activates its audio session.
Silenced by the Silent switch: If yes, your audio is silenced when the user moves the Silent switch to silent. (On iPhone, this switch is called the Ring/Silent switch.)
Supports audio input: If yes, app audio input (recording), is allowed.
Supports audio output: If yes, app audio output (playback), is allowed.
Looks like the default category silences audio from other apps:
AVAudioSessionCategorySoloAmbient—(Default) Playback only. Silences audio when the user switches the Ring/Silent switch to the “silent” position and when the screen locks. This category differs from the AVAudioSessionCategoryAmbient category only in that it interrupts other audio.
The key here is in the last sentence: "it interrupts other audio".
There are a number of other categories you can use depending on whether or not you want your audio silenced when the screen is locked, etc. AVAudioSessionCategoryAmbient does not silence audio.
Give this a try in the objective-c portion of your app:
NSError *setCategoryError = nil;
BOOL success = [[AVAudioSession sharedInstance]
setCategory: AVAudioSessionCategoryAmbient
error: &setCategoryError];
if (!success) { /* handle the error in setCategoryError */ }
I've been experimenting with connecting an audio element to the Web Audio API using createMediaElementSource and got it to work, but one thing I need to do is change the playback rate of the audio tag, and I couldn't get that to work.
If you try to run the code below, you'll see that it works until you uncomment the line where we set the playback rate. When this line is in, the audio gets muted.
I know I can set the playback rate on an AudioBufferSourceNode using source.playbackRate.value, but this is not what I'd like to do. I need to set the playback rate on the audio element while it's connected to the Web Audio API using createMediaElementSource, so I don't have an AudioBufferSourceNode.
Has anyone managed to do that?
var _source,
    _audio,
    _context,
    _gainNode;

_context = new webkitAudioContext();

function play(url) {
    if (_audio) {
        _audio.pause();
    }
    _audio = new Audio(url);
    //_audio.playbackRate = 0.6;
    setTimeout(function() {
        if (!_gainNode) {
            _gainNode = _context.createGainNode();
            _gainNode.gain.value = 0.1;
            _gainNode.connect(_context.destination);
        }
        _source = _context.createMediaElementSource(_audio);
        _source.connect(_gainNode);
        _audio.play();
    }, 0);
}

play("http://geo-samples.beatport.com/items/volumes/volume2/items/3000000/200000/40000/9000/400/60/3249465.LOFI.mp3");

setTimeout(function () {
    _audio.pause();
}, 4000);
You have to set the playback rate after the audio has started playing. The only portable way I have found to make this work is by waiting until you get a timeupdate event with a valid currentTime:
_audio.addEventListener('timeupdate', function() {
    if (!isNaN(_audio.currentTime)) {
        _audio.playbackRate = 0.6;
    }
});
Note that playback rate isn't currently supported on Android and that Chrome (on desktop) doesn't support playback rates lower than 0.5.
Which browser are you using to test this? It seems this is not yet implemented in Firefox, but should be working on Chrome.
Mozilla bug for implementing playbackRate:
https://bugzilla.mozilla.org/show_bug.cgi?id=495040