Live streaming audio with WebRTC and WebAudio (WebSockets) - javascript

I'm trying to set up a live audio streaming system where a client broadcasts the audio from their microphone (accessed with getUserMedia) to one or more peers.
To do so, chunks of the audio stream are sent through a WebSocket to a server, which then relays them to all the peers connected to the WebSocket.
My main problem is how to play the chunks of data received by the peers on a website.
First, this is how I send the chunks of audio data in my broadcasting client-side JS script:
var context = new AudioContext();
var audioStream = context.createMediaStreamSource(stream);
// Create a processor node of buffer size, with one input channel, and one output channel
var node = context.createScriptProcessor(2048, 1, 1);
// listen to the audio data, and record into the buffer
node.onaudioprocess = function(e){
    var inputData = e.inputBuffer.getChannelData(0);
    ws.send(JSON.stringify({sound: _arrayBufferToBase64(convertoFloat32ToInt16(inputData))}));
}
audioStream.connect(node);
node.connect(context.destination);
_arrayBufferToBase64 and convertoFloat32ToInt16 are methods I use to, respectively, send the stream in base64 format and convert the inputData to Int16 instead of the Float32 representation (methods found on SO that are supposed to work).
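For reference, here is a minimal sketch of what those two helpers might look like (the names mirror the snippet above; the clamp-and-scale constants are the usual convention for 16-bit PCM, and the Buffer fallback is only there so the sketch also runs under Node):

```javascript
// Hypothetical implementations of the helpers referenced above.
// Float32 Web Audio samples lie in [-1, 1]; scale them to signed 16-bit PCM.
function convertoFloat32ToInt16(float32Arr) {
  const int16 = new Int16Array(float32Arr.length);
  for (let i = 0; i < float32Arr.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Arr[i])); // clamp to [-1, 1]
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;         // scale to Int16 range
  }
  return int16.buffer;
}

// Base64-encode an ArrayBuffer. btoa() is browser-only; Buffer covers Node.
function _arrayBufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  if (typeof btoa !== 'function') {
    return Buffer.from(bytes).toString('base64');
  }
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```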
Then, after the data has gone through the WebSocket, I collect it in another script, which is executed on each peer's website:
var audioCtx = new AudioContext();
var arrayBuffer = _base64ToArrayBuffer(mediaJSON.sound);
audioCtx.decodeAudioData(arrayBuffer, function(buffer) {
    playSound(buffer);
});
I also need to convert the base64 data received to an ArrayBuffer, which is then decoded by decodeAudioData to produce an AudioBuffer. The playSound function is as simple as this:
function playSound(arrBuff) {
    var src = audioCtx.createBufferSource();
    src.buffer = arrBuff;
    src.loop = false; // the property is "loop"; "looping" does not exist
    src.connect(audioCtx.destination);
    src.start();
}
But for some reason, I can't get any sound to play from this script. I'm pretty sure the broadcasting script is correct, but not the "listener" script. Can anyone help me with this?
Thanks!

JavaScript play arraybuffer as audio. Need help to solve "decodeaudiodata unable to decode audio data"

I have a .NET Core WebSocket server that receives live stream audio from client A, and I need to stream this live audio to client B (a browser). I've received the byte array from client A and sent it on to client B. (The byte array is correct: I can convert it into a .wav file and play it without a problem.)
In client B, I try to decode the array buffer into an audio buffer so it can be sent to the output and played.
mediastreamhandler.SendArraySegToAllAsync is where I start to send the byte array from the server to client B. I use the send-to-all method first; later it will be modified to send out data by matching WebSocket connection ID.
private async Task Echo(HttpContext context, WebSocket webSocket)
{
    Debug.WriteLine("Start Echo between Websocket server & client");
    var buffer = new byte[1024 * 4];
    WebSocketReceiveResult result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
    while (!result.CloseStatus.HasValue)
    {
        await webSocket.SendAsync(new ArraySegment<byte>(buffer, 0, result.Count), result.MessageType, result.EndOfMessage, CancellationToken.None);
        result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        await mediastreamhandler.SendArraySegToAllAsync(new ArraySegment<byte>(buffer, 0, result.Count));
    }
    Debug.WriteLine("Close Echo");
    await webSocket.CloseAsync(result.CloseStatus.Value, result.CloseStatusDescription, CancellationToken.None);
}
I then receive the audio byte array through websocket.onmessage in JavaScript and pass it on to be decoded and played. But here it says "unable to decode data", while Firefox says the content format was unknown (do I need to reformat the byte array I receive?). The byte array itself is fine, because I've used the same bytes to create a .wav file locally and play it without any problem.
var ctx = new AudioContext();
function playSound(arrBuff) {
    var myAudioBuffer;
    var src = ctx.createBufferSource();
    ctx.decodeAudioData(arrBuff, function (buffer) {
        myAudioBuffer = buffer;
    });
    src.buffer = myAudioBuffer;
    src.connect(ctx.destination);
    src.start();
}
I then tried another method to decode and play the audio; this time it played some white noise instead of the audio streamed from client A.
var ctx = new AudioContext();
function playSound(arrBuff) {
    var myAudioBuffer;
    var src = ctx.createBufferSource();
    myAudioBuffer = ctx.createBuffer(1, arrBuff.byteLength, 8000);
    var nowBuffering = myAudioBuffer.getChannelData(0);
    for (var i = 0; i < arrBuff.byteLength; i++) {
        nowBuffering[i] = arrBuff[i];
    }
    src.buffer = myAudioBuffer;
    src.connect(ctx.destination);
    src.start();
}
I think I really need some help here, guys. I've been trying to play out this array buffer for weeks without a breakthrough, and I'm stuck. Could you kindly guide me, or suggest any other approach? Thanks very much in advance, I really mean it.
decodeAudioData() requires complete files, so it can't be used to decode partial chunks of data as they arrive from a WebSocket. If you can stream Opus audio files over your WebSocket, you can play them back with an available WebAssembly decoder. See:
https://fetch-stream-audio.anthum.com/
https://github.com/AnthumChris/fetch-stream-audio
I solved this issue months ago; I'm just here to post my solution.
Steps:
The server receives the payload string from Twilio.
Send the payload string from the server to the client (browser):
public async Task SendMessageAsync(WebSocket socket, string message)
{
    if (socket.State != WebSocketState.Open)
        return;
    await socket.SendAsync(buffer: new ArraySegment<byte>(array: Encoding.ASCII.GetBytes(message),
                                                          offset: 0,
                                                          count: message.Length),
                           messageType: WebSocketMessageType.Text,
                           endOfMessage: true,
                           cancellationToken: CancellationToken.None);
}
Add a WAV header to the payload string on the client side before playing the audio:
function playSound(payloadBase64) {
    /* You can generate the wav header here --> https://codepen.io/mxfh/pen/mWLMrJ */
    var Base64wavheader = "UklGRgAAAABXQVZFZm10IBIAAAAHAAEAQB8AAEAfAAABAAgAAABmYWN0BAAAAAAAAABkYXRh";
    var audio = new Audio('data:audio/wav;base64,' + Base64wavheader + payloadBase64);
    audio.play();
}
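For what it's worth, that hard-coded constant can also be generated. It decodes to a 54-byte WAV header: format tag 7 (mu-law, which is what Twilio media streams carry), mono, 8 kHz, 8 bits per sample, with the RIFF/fact sizes zeroed and the data-chunk length omitted. Since 54 bytes is a multiple of 3, the base64 header concatenates cleanly with the base64 payload. A sketch:

```javascript
// Build the 54-byte mu-law WAV header used above and base64-encode it.
// The data-chunk size field is deliberately omitted: the payload follows directly.
function buildMuLawWavHeaderBase64(sampleRate = 8000) {
  const bytes = [];
  const str = (s) => { for (const ch of s) bytes.push(ch.charCodeAt(0)); };
  const u16 = (v) => bytes.push(v & 0xff, (v >> 8) & 0xff);
  const u32 = (v) => bytes.push(v & 0xff, (v >> 8) & 0xff, (v >> 16) & 0xff, (v >> 24) & 0xff);

  str('RIFF'); u32(0);            // RIFF size left as 0 (unknown for a live stream)
  str('WAVE');
  str('fmt '); u32(18);           // 18-byte fmt chunk (16 + cbSize)
  u16(7);                         // format tag 7 = WAVE_FORMAT_MULAW
  u16(1);                         // mono
  u32(sampleRate);                // sample rate
  u32(sampleRate);                // byte rate (1 byte per sample)
  u16(1);                         // block align
  u16(8);                         // bits per sample
  u16(0);                         // cbSize
  str('fact'); u32(4); u32(0);    // fact chunk, sample count unknown
  str('data');                    // data chunk header; size omitted, payload follows

  const bin = String.fromCharCode(...bytes);
  return typeof btoa === 'function' ? btoa(bin) : Buffer.from(bytes).toString('base64');
}
```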

How to create a live media stream with Javascript

I want to create a live audio stream from one device to a Node server, which can then broadcast that live feed to several front ends.
I have searched extensively for this and have really hit a wall, so I'm hoping somebody out there can help.
I am able to get my audio input from the window.navigator.getUserMedia API.
getAudioInput(){
    const constraints = {
        video: false,
        audio: {deviceId: this.state.deviceId ? {exact: this.state.deviceId} : undefined},
    };
    window.navigator.getUserMedia(
        constraints,
        this.initializeRecorder,
        this.handleError
    );
}
This then passes the stream to the initializeRecorder function, which uses the AudioContext API to create a media stream source via createMediaStreamSource:
initializeRecorder = (stream) => {
    const audioContext = window.AudioContext;
    const context = new audioContext();
    const audioInput = context.createMediaStreamSource(stream);
    const bufferSize = 2048;
    // create a javascript node
    const recorder = context.createScriptProcessor(bufferSize, 1, 1);
    // specify the processing function
    recorder.onaudioprocess = this.recorderProcess;
    // connect stream to our recorder
    audioInput.connect(recorder);
    // connect our recorder to the previous destination
    recorder.connect(context.destination);
}
In my recorderProcess function, I now have an AudioProcessingEvent object which I can stream.
Currently I am emitting the audio event as a stream via a socket connection like so:
recorderProcess = (e) => {
    const left = e.inputBuffer.getChannelData(0);
    this.socket.emit('stream', this.convertFloat32ToInt16(left))
}
Is this the best or only way to do this? Is there a better way, such as using fs.createReadStream and then posting to an endpoint via Axios? As far as I can tell, that will only work with a file, as opposed to a continuous live stream.
Server
I have a very simple socket server running on top of Express. Currently I listen for the stream event and then emit that same input back out:
io.on('connection', (client) => {
    client.on('stream', (stream) => {
        client.emit('stream', stream)
    });
});
Not sure how scalable this is but if you have a better suggestion, I'm very open to it.
Client
Now this is where I am really stuck:
On my client I am listening for the stream event and want to listen to the stream as audio output in my browser. I have a function that receives the event, but I am stuck as to how I can use the arrayBuffer object that is being returned.
retrieveAudioStream = () => {
    this.socket.on('stream', (buffer) => {
        // ... how can I listen to the buffer as audio
    })
}
Is the way I am streaming audio the best / only way I can upload to the node server?
How can I listen to the arrayBuffer object that is being returned on my client side?
Is the way I am streaming audio the best / only way I can upload to the node server?
Not really the best, but I have seen worse. It's not the only way either, but using WebSockets is considered OK here, since you want things to be "live" rather than sending an HTTP POST request every 5 seconds.
How can I listen to the arrayBuffer object that is being returned on my client side?
You can try BaseAudioContext.decodeAudioData to listen to the streamed data; the example there is pretty simple.
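Note that decodeAudioData expects complete encoded files, so if the server is simply relaying the raw Int16 chunks produced by convertFloat32ToInt16, the client has to undo that conversion manually instead. A sketch under that assumption (int16ToFloat32 is the illustrative inverse; playChunk and the 44100 default are assumptions you'd match to your capture settings):

```javascript
// Convert a chunk of signed 16-bit PCM back to Float32 samples in [-1, 1].
function int16ToFloat32(arrayBuffer) {
  const int16 = new Int16Array(arrayBuffer);
  const float32 = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    float32[i] = int16[i] / (int16[i] < 0 ? 0x8000 : 0x7fff);
  }
  return float32;
}

// Browser-side playback sketch: copy the samples into an AudioBuffer.
// `audioCtx` and the 44100 rate are assumptions; match your capture settings.
function playChunk(audioCtx, arrayBuffer, sampleRate = 44100) {
  const samples = int16ToFloat32(arrayBuffer);
  const audioBuffer = audioCtx.createBuffer(1, samples.length, sampleRate);
  audioBuffer.copyToChannel(samples, 0);
  const src = audioCtx.createBufferSource();
  src.buffer = audioBuffer;
  src.connect(audioCtx.destination);
  src.start();
}
```

Starting each chunk as soon as it arrives tends to leave audible gaps; queuing chunks with src.start(scheduledTime) gives smoother playback.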
From the code snippets you provide, I assume you want to build something from scratch to learn how things work.
In that case, you can try the MediaStream Recording API along with a WebSocket server that sends the chunks to X clients so they can reproduce the audio, etc.
It would also make sense to invest time in the WebRTC API, to learn how to stream from one client to another.
Also take a look at the links below for some useful information.
(stackoverflow) Get live streaming audio from NodeJS server to clients
(github) video-conference-webrtc
twitch.tv tech stack article
rtc.io

How to play Base64 encoded audio file from client side (Google Cloud TTS)

I'm trying to play the audio output of Google Cloud TTS on a given browser. I can successfully save the TTS output as a wav file, but what I want to do is play the byte array from the client side. Right now when I play my audio byte array, all I get is loud static.
According to the google cloud documentation, I need to convert the base64 encoded text to binary before audio can be played (https://cloud.google.com/text-to-speech/docs/base64-decoding), so I've done that below:
For converting base64 to binary, I referred to: Python converting from base64 to binary
from google.cloud import texttospeech
import base64

def synthesize_text(text):
    """Synthesizes speech from the input string of text."""
    client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.types.SynthesisInput(text=text)
    # Note: the voice can also be specified by name.
    # Names of voices can be retrieved with client.list_voices().
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.LINEAR16)
    response = client.synthesize_speech(input_text, voice, audio_config)
    print(type(response.audio_content))
    # The response's audio_content is binary.
    audio = response.audio_content
    decoded = base64.decodebytes(audio)
    decoded_audio = "".join(["{:08b}".format(x) for x in decoded])
    with open('static/playback.wav', 'wb') as out:
        out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
    return decoded_audio
I've passed the decoded_audio binary audio data through a flask_socketio connection, which then goes to my JavaScript like so:
socket.on('audio', function(msg) {
    playWave(msg);
})
And then I'm trying to play the audio through the function playWave (which I got from this: Play wav file as bytes received from server):
function playWave(byteArray) {
    console.log(byteArray.length)
    var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
    var myAudioBuffer = audioCtx.createBuffer(1, byteArray.length, 8000);
    var nowBuffering = myAudioBuffer.getChannelData(0);
    for (var i = 0; i < byteArray.length; i++) {
        nowBuffering[i] = byteArray[i];
    }
    var source = audioCtx.createBufferSource();
    source.buffer = myAudioBuffer;
    source.connect(audioCtx.destination);
    source.start();
}
I'm not really sure why the only audio output I'm getting is loud static. Maybe I'm decoding the Base64-encoded text incorrectly (I'm requesting LINEAR16, which should be suitable for wav) before converting it to a binary byte array. Or maybe my sampling rate or my playWave function isn't right. Does anyone have experience with playing Base64-encoded audio from the client browser side?
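One likely culprit, offered as a guess: LINEAR16 is 16-bit PCM, so treating each byte as one sample (and assuming a sample rate of 8000 rather than the rate the API actually returned) produces exactly this kind of static. The bytes would need to be paired into little-endian Int16 values and normalized into [-1, 1] before being written into the Float32 channel data. A sketch of that conversion:

```javascript
// Interpret a Uint8Array as little-endian 16-bit PCM and normalize to [-1, 1].
function pcm16BytesToFloat32(bytes) {
  // DataView makes the little-endian byte pairing explicit.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(Math.floor(bytes.byteLength / 2));
  for (let i = 0; i < out.length; i++) {
    const s = view.getInt16(i * 2, /* littleEndian = */ true);
    out[i] = s / (s < 0 ? 0x8000 : 0x7fff);
  }
  return out;
}
```

The playWave function above would then create its AudioBuffer with out.length samples at the actual TTS sample rate instead of byteArray.length samples at 8000 Hz.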

Synchronize audios with HTML5 and Javascript

I want to join two audios into one and synchronize them with HTML5 on the client side. I've seen that the Web Audio API can do many things, but I have not been able to find how.
I have links to two audio files (.mp3, .wav, ...); what I want is to synchronize these two audio files, like a voice and a song. I do not want them played one after another; I want them synchronized.
I would do it all on the client side using HTML5, without needing to use the server. Is this possible to do?
Thank you so much for your help.
As I understand it, you have two audio files which you want to render together on the client. The Web Audio API can do this for you quite easily, entirely in JavaScript. A good place to start is http://www.html5rocks.com/en/tutorials/webaudio/intro/
An example script would be
var context = new (window.AudioContext || window.webkitAudioContext)(); // Create an audio context
// Create an XML HTTP Request to collect your audio files
// https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest
var xhr1 = new XMLHttpRequest();
var xhr2 = new XMLHttpRequest();
var audio_buffer_1, audio_buffer_2;
xhr1.open("GET", "your_url_to_audio_1");
xhr1.responseType = 'arraybuffer';
xhr1.onload = function() {
    // Decode the audio data
    context.decodeAudioData(xhr1.response, function(buffer) {
        audio_buffer_1 = buffer;
    }, function(error){});
};
xhr2.open("GET", "your_url_to_audio_2");
xhr2.responseType = 'arraybuffer';
xhr2.onload = function() {
    // Decode the audio data
    context.decodeAudioData(xhr2.response, function(buffer) {
        audio_buffer_2 = buffer;
    }, function(error){});
};
xhr1.send();
xhr2.send();
These load the Web Audio API AudioBuffer objects (https://webaudio.github.io/web-audio-api/#AudioBuffer) for your two files into the global variables audio_buffer_1 and audio_buffer_2.
Now, to create a new audio buffer, you need to use an OfflineAudioContext:
// Assumes both buffers are of the same length. If not you need to modify the 2nd argument below
var offlineContext = new OfflineAudioContext(context.destination.channelCount, audio_buffer_1.duration * context.sampleRate, context.sampleRate);
var summing = offlineContext.createGain();
summing.connect(offlineContext.destination);
// Build the two buffer source nodes and attach their buffers
var buffer_1 = offlineContext.createBufferSource();
var buffer_2 = offlineContext.createBufferSource();
buffer_1.buffer = audio_buffer_1;
buffer_2.buffer = audio_buffer_2;
// Connect both sources to the summing gain node
buffer_1.connect(summing);
buffer_2.connect(summing);
// Do something with the result by adding a callback
offlineContext.oncomplete = function(e) {
    var renderedBuffer = e.renderedBuffer;
    // Place code here
};
// Begin the summing
buffer_1.start(0);
buffer_2.start(0);
offlineContext.startRendering();
Once done, the callback receives a new buffer, renderedBuffer, which is the direct summation of the two buffers.
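Conceptually, the "direct summation" the offline graph performs is just per-sample addition of the two sources' channel data, something like:

```javascript
// Per-sample summation, i.e. what the gain-node mixdown computes for one channel.
// The shorter buffer is treated as padded with silence (zeros).
function sumChannels(a, b) {
  const out = new Float32Array(Math.max(a.length, b.length));
  for (let i = 0; i < out.length; i++) {
    out[i] = (a[i] || 0) + (b[i] || 0);
  }
  return out;
}
```

Note that a plain sum can exceed [-1, 1] and clip; setting summing.gain.value to 0.5, or normalizing the rendered buffer afterwards, avoids that.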

Record Audio Stream from getUserMedia

In recent days, I have tried to use JavaScript to record an audio stream.
I found that there is no example code which works.
Is this supported by any browser?
Here is my code
navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia ||
    navigator.mozGetUserMedia || navigator.msGetUserMedia;
navigator.getUserMedia({ audio: true }, gotStream, null);
function gotStream(stream) {
    msgStream = stream;
    msgStreamRecorder = stream.record(); // no method record :(
}
getUserMedia gives you access to the device, but it is up to you to record the audio. To do that, you'll want to 'listen' to the device, building a buffer of the data. Then when you stop listening to the device, you can format that data as a WAV file (or any other format). Once formatted you can upload it to your server, S3, or play it directly in the browser.
To listen to the data in a way that is useful for building your buffer, you will need a ScriptProcessorNode. A ScriptProcessorNode basically sits between the input (microphone) and the output (speakers), and gives you a chance to manipulate the audio data as it streams. Unfortunately the implementation is not straightforward.
You'll need:
getUserMedia to access the device
AudioContext to create a MediaStreamAudioSourceNode and a ScriptProcessorNode
MediaStreamAudioSourceNode to represent the audio stream
ScriptProcessorNode to get access to the streaming audio data via an onaudioprocess event. The event exposes the channel data that you'll build your buffer with.
Putting it all together:
navigator.getUserMedia({audio: true},
    function(stream) {
        // create the MediaStreamAudioSourceNode
        var context = new AudioContext();
        var source = context.createMediaStreamSource(stream);
        var recLength = 0,
            recBuffersL = [],
            recBuffersR = [];
        // create a ScriptProcessorNode
        if (!context.createScriptProcessor) {
            node = context.createJavaScriptNode(4096, 2, 2);
        } else {
            node = context.createScriptProcessor(4096, 2, 2);
        }
        // listen to the audio data, and record into the buffer
        node.onaudioprocess = function(e){
            recBuffersL.push(e.inputBuffer.getChannelData(0));
            recBuffersR.push(e.inputBuffer.getChannelData(1));
            recLength += e.inputBuffer.getChannelData(0).length;
        }
        // connect the ScriptProcessorNode with the input audio
        source.connect(node);
        // if the ScriptProcessorNode is not connected to an output the "onaudioprocess" event is not triggered in chrome
        node.connect(context.destination);
    },
    function(e) {
        // do something about errors
    });
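The "format that data as a WAV file" step starts by flattening the recorded chunk lists into continuous buffers and interleaving the two channels. A sketch of those two helpers (the names are illustrative, modeled on what recorder libraries commonly do):

```javascript
// Flatten an array of Float32Array chunks into one continuous buffer.
function mergeBuffers(chunks, totalLength) {
  const result = new Float32Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.length;
  }
  return result;
}

// Interleave left/right samples (L0 R0 L1 R1 ...) as WAV expects for stereo.
function interleave(left, right) {
  const result = new Float32Array(left.length + right.length);
  let index = 0;
  for (let i = 0; i < left.length; i++) {
    result[index++] = left[i];
    result[index++] = right[i];
  }
  return result;
}
```

From there, writing the WAV file is the usual clamp-to-Int16 pass plus a 44-byte PCM header.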
Rather than building all of this yourself I suggest you use the AudioRecorder code, which is awesome. It also handles writing the buffer to a WAV file. Here is a demo.
Here's another great resource.
For browsers that support the MediaRecorder API, use it.
For older browsers that do not support the MediaRecorder API, there are three ways to do it:
as wav
all code client-side.
uncompressed recording.
source code --> http://github.com/mattdiamond/Recorderjs
as mp3
all code client-side.
compressed recording.
source code --> http://github.com/Mido22/mp3Recorder
as opus packets (can get output as wav, mp3 or ogg)
client and server(node.js) code.
compressed recording.
source code --> http://github.com/Mido22/recordOpus
You could check this site:
https://webaudiodemos.appspot.com/AudioRecorder/index.html
It stores the audio into a file (.wav) on the client side.
There is a bug that currently does not allow audio only. Please see http://code.google.com/p/chromium/issues/detail?id=112367
Currently, this is not possible without sending the data over to the server side. However, this would soon become possible in the browser if they start supporting the MediaRecorder working draft.
