speex splitted audio data - WebAudio - VOIP

speex splitted audio data - WebAudio - VOIP - javascript

Im running a little app that encodes and decodes an audio array with the speex codec in javascript: https://github.com/dbieber/audiorecorder
with a small array filled with a sin waveform
for(var i=0;i<16384;i++)
data.push(Math.sin(i/10));
this works. But I want to build a VOIP application and have more than one array. So if I split my array up in 2 parts encode>decode>merge, it doesn't sound the same as before.
Take a look at this:
fiddle: http://jsfiddle.net/exh63zqL/
Both buttons should give the same audio output.
How can i get the same output in both ways ? Is their a special mode in speex.js for split audio data?

Speex is a lossy codec, so the output is only an approximation of your initial sine wave.
Your sine frequency is about 7 KHz, which is near the upper codec 8KHz bandwith and as such even more likely to be altered.
What the codec outputs looks like a comb of dirach pulses that will sound like your initial sinusoid as heard through a phone, which is certainly different from the original.
See this fiddle where you can listen to what the codec makes of your original sine waves, be them split in half or not.
//Generate a continus sinus in 2 arrays
var len = 16384;
var buffer1 = [];
var buffer2 = [];
var buffer = [];
for(var i=0;i<len;i++){
buffer.push(Math.sin(i/10));
if(i < len/2)
buffer1.push(Math.sin(i/10));
else
buffer2.push(Math.sin(i/10));
}
//Encode and decode both arrays seperatly
var en = Codec.encode(buffer1);
var dec1 = Codec.decode(en);
var en = Codec.encode(buffer2);
var dec2 = Codec.decode(en);
//Merge the arrays to 1 output array
var merge = [];
for(var i in dec1)
merge.push(dec1[i]);
for(var i in dec2)
merge.push(dec2[i]);
//encode and decode the whole array
var en = Codec.encode(buffer);
var dec = Codec.decode(en);
//-----------------
//Down under is only for playing the 2 different arrays
//-------------------
var audioCtx = new window.AudioContext || new window.webkitAudioContext;
function play (sound)
{
var audioBuffer = audioCtx.createBuffer(1, sound.length, 44100);
var bufferData = audioBuffer.getChannelData(0);
bufferData.set(sound);
var source = audioCtx.createBufferSource();
source.buffer = audioBuffer;
source.connect(audioCtx.destination);
source.start();
}
$("#o").click(function() { play(dec); });
$("#c1").click(function() { play(dec1); });
$("#c2").click(function() { play(dec2); });
$("#m").click(function() { play(merge); });
If you merge the two half signal decoder outputs, you will hear an additional click due to the abrupt transition from one signal to the other, sounding basically like a relay commutation.
To avoid that you would have to smooth the values around the merging point of your two buffers.

Note that Speex is a lossy codec. So, by definition, it can't give same result as the encoded buffer. Besides, it designed to be a codec for voice. So the 1-2 kHz range will be the most efficient as it expects a specific form of signal. In some way, it can be compared to JPEG technology for raster images.
I've modified slightly your jsfiddle example so you can play with different parameters and compare results. Just providing a simple sinusoid with an unknown frequency is not a proper way to check a codec. However, in the example you can see different impact on the initial signal at different frequency.
buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
I think you should build an example with a recorded voice and compare results in this case. It would be more proper.
In general to get the idea in detail you would have to examine digital signal processing. I can't even provide a proper link since it is a whole science and it is mathematically intensive. (the only proper book for reading I know is in Russian). If anyone here with strong mathematics background can share proper literature for this case I would appreciate.
EDIT: as mentioned by Kuroi Neko, there is a trouble with the boundaries of the buffer. And seems like it is impossible to save decoder state as mentioned in this post, because the library in use doesn't support it. If you look at the source code you see that they use a third party speex codec and do not provide full access to it's features. I think the best approach would be to find a decent library for speex that supports state recovery similar to this

Related

Dividing one audio signal by another one

Short version
I need to divide an audio signal by another one (amplitude-wise). How could I accomplish this in the Web Audio API, without using ScriptProcessorNode? (with ScriptProcessorNode the task is trivial, but it is completely unusable for production due to the inherent performance issues)
Long version
Consider two audio sources, two OscillatorNodes for example, oscA and oscB:
var oscA = audioCtx.createOscillator();
var oscB = audioCtx.createOscillator();
Now, consider that these oscillators are LFOs, both with low (i.e. <20Hz) frequencies, and that their signals are used to control a single destination AudioParam, for example, the gain of a GainNode. Through various routing setups, we can define mathematical operations between these two signals.
Addition
If oscA and oscB are both directly connected to the destination AudioParam, their outputs are added together:
var dest = audioCtx.createGain();
oscA.connect(dest.gain);
oscB.connect(dest.gain);
Subtraction
If the output of oscB is first routed through another GainNode with a gain of -1, which is then connected to the destination AudioParam, then the output of oscB is effectively subtracted from that of oscA, because we are effectively doing an oscA + -oscB op. Using this trick we can subtract one signal from another one:
var dest = audioCtx.createGain();
var inverter = audioCtx.createGain();
oscA.connect(dest.gain);
oscB.connect(inverter);
inverter.gain = -1;
inverter.connect(dest.gain);
Multiplication
Similarly, if the output of oscA is connected to another GainNode, and the output of oscB is connected to the gain AudioParam of that GainNode, then oscB is multiplying the signal of oscA:
var dest = audioCtx.createGain();
var multiplier = audioCtx.createGain();
oscA.connect(multiplier);
oscB.connect(multiplier.gain);
multiplier.connect(dest.gain);
Division (?)
Now, I want the output of oscB to divide the output of oscA. How do I do this, without using ScriptProcessorNode?
Edit
My earlier, absolutely ridiculous attempts at solving this problem were:
Using a PannerNode to control the positionZ param, which did yield a result that decreased as signal B (oscB) increased, but it was completely off (e.g. it yielded 12/1 = 8.5 and 12/2 = 4.2) -- now this value can be compensated for by using a GainNode with its gain set to 12 / 8.48528099060058593750 (approximation), but it only supports values >=1
Using an AnalyserNode to rapidly sample the audio signal and then use that value (LOL)
Edit 2
The reason why the ScriptProcessorNode is essentially useless for applications more complex than a tech demo is that:
it executes audio processing on the main thread (!), and heavy UI work will introduce audio glitches
a single, dead simple ScriptProcessorNode will take 5% CPU power on a modern device, as it performs processing with JavaScript and requires data to be passed between the audio thread (or rendering thread) and the main thread (or UI thread)

It should be noted, that ScriptProcessorNode is deprecated.
If you need A/B, therefore you need 1/B, inverted signal. You can use WaveShaperNode to make the inversion. This node needs an array of corresponding values. Inversion means that -1 becomes -1, -0.5 becomes -2 etc.
In addition, make sure that you are aware of division by zero. You have to handle it. In the following code I just take the next value after zero and double it.
function makeInverseCurve( ) {
var n_samples = 44100,
curve = new Float32Array(n_samples),
x;
for (var i = 0 ; i < n_samples; i++ ) {
x = i * 2 / n_samples - 1;
// if x = 0, let reverse value be twice the previous
curve[i] = (i * 2 == n_samples) ? n_samples : 1 / x;
}
return curve;
};
Working fiddle is here. If you remove .connect(distortion) out of the audio chain, you see a simple sine wave. Visualization code got from sonoport.

WebAudio - Oscillator setPeridiocWave

I create three different linear chirps using the code found here on SO. With some other code snippets I save those three sounds as separate .wav files. This works so far.
Now I want to play those three sounds at the exact same time. So I thought I could use the WebAudio API, feed three oscillator nodes with the float arrays I got from the code above.
But I don't get at least one oscillator node to play its sound.
My code so far (shrinked for one oscillator)
var osc = audioCtx.createOscillator();
var sineData = linearChirp(freq, (freq + signalLength), signalLength, audioCtx.sampleRate); // linearChirp from link above
// sine values; add 0 at the front because the docs states that the first value is ignored
var imag = Float32Array.from(sineData.unshift(0));
var real = new Float32Array(imag.length); // cos values
var customWave = audioCtx.createPeriodicWave(real, imag);
osc.setPeriodicWave(customWave);
osc.start();
At the moment I suppose that I do not quite understand the whole the math behind the peridioc wave.
The code that plays the three sounds at the same time works (with simple sin values in the oscillator nodes), so I assume that the problem is my peridioc wave.
Another question: is there a different way? Maybe like using three MediaElementAudioSourceNode that are linked to my three .wav files. I don't see a way to play them at the exact same time.

The PeriodicWave isn't a "stick a waveform in here and it will be used as a single oscillation" feature - it builds a waveform through specifying the relative strengths of various harmonics. Note that in that code you pointed to, they create a BufferSource node and point its .buffer to the results of linearchirp(). You can do that, too - just use BufferSource nodes to play back the linearshirp() outputs, which (I think?) are just sine waves anyway? (If so, you could just use an oscillator and skip that whole messy "create a buffer" bit.)
If you just want to play back the buffers you've created, use BufferSource. If you want to create complex harmonics, use PeriodicWave. If you've created a single-cycle waveform and you want to play it back as a source waveform, use BufferSource and loop it.

WebAudio sounds from wave point

Suppose that I make a simple canvas drawing app like this:
I now have a series of points. How can I feed them to some of the WebAudio objects (an oscillator or a sound make from a byte array or something) to actually generate and play a wave out of them (in this case a sine-like wave)? What is the theory behind it?

If you have the data from your graph in an array, y, you can do something like
var buffer = context.createBuffer(1, y.length, context.sampleRate);
buffer.copyToChannel(y);
var src = context.createBufferSource();
src.buffer = buffer;
src.start()
You may need to set the sample rate in context.createBuffer to something other than context.sampleRate, depending on the data from your graph.

Get Final Output Frequency of Chained Oscillators

I've set up a web page with a theremin and I'm trying to change the color of a web page element based on the frequency of the note being played. The way I'm generating sound right now looks like this:
osc1 = page.audioCX.createOscillator();
pos = getMousePos(page.canvas, ev);
osc1.frequency.value = pos.x;
gain = page.audioCX.createGain();
gain.gain.value = 60;
osc2 = page.audioCX.createOscillator();
osc2.frequency.value = 1;
osc2.connect(gain);
gain.connect(osc1.frequency);
osc1.connect(page.audioCX.destination);
What this does is oscillate the pitch of the sound created by osc1. I can change the color to the frequency of osc1 by using osc1.frequency.value, but this doesn't factor in the changes applied by the other parts.
How can I get the resultant frequency from those chained elements?

You have to do the addition yourself (osc1.frequency.value + output of gain).
The best current (but see below) way to get access to the output of gain is probably to use a ScriptProcessorNode. You can just use the last sample from each buffer passed to the ScriptProcessorNode, and set the buffer size based on how frequently you want to update the color.
(Note on ScriptProcessorNode: There is a bug in Chrome and Safari that makes ScriptProcessorNode not work if it doesn't have at least one output channel. You'll probably have to create it with one input and one output, have it send all zeros to the output, and connect it to the destination, to get it to work.)
Near-future answer: You can also try using an AnalyserNode, but under the current spec, the time domain data can only be read from an AnalyserNode as bytes, which means the floating point samples are being converted to be in the range [0, 255] in some unspecified way (probably scaling the range [-1, 1] to [0, 255], so the values you need would be clipped). The latest draft spec includes a getFloatTimeDomainData method, which is probably your cleanest solution. It seems to have already been implemented in Chrome, but not Firefox, as far as I can tell.

Upsampling Audio PCM-data in Javascript with WebAudioApi

For a project, I'm retrieving a live audio stream via WebSockets from a Java server. On the server, I'm processing the samples in 16Bit/8000hz/mono in the form of 8-bit signed byte values (with two bytes making up one sample). On the browser, however, the lowest supported samplerate is 22050 hz. So my idea was to "simply" upsample the existing 8000 to 32000 hz, which is supported and seems to me like an easy calculation.
So far, I've tried linear upsampling and cosine interpolation, but both didn't work. In addition to sounding really distorted, the first one also added some clicking noises. I might also have trouble with the WebAudioAPI in Chrome, but at least the sound is playing and is barely recognizable as what it should be. So I guess no codec- or endianess-problem.
Here's the complete code that gets executed when a binary packet with sound data is received. I'm creating new buffers and buffersources all the time for the sake of simplicity (yeah, no good for performance). data is an ArrayBuffer. First, I'm converting the samples to Float, then I'm upsampling.
//endianess-aware buffer view
var bufferView=new DataView(data),
//the audio buffer to set for output
buffer=_audioContext.createBuffer(1,640,32000),
//reference to underlying buffer array
buf=buffer.getChannelData(0),
floatBuffer8000=new Float32Array(160);
//16Bit => Float
for(var i=0,j=null;i<160;i++){
j=bufferView.getInt16(i*2,false);
floatBuffer8000[i]=(j>0)?j/32767:j/-32767;
}
//convert 8000 => 32000
var point1,point2,point3,point4,mu=0.2,mu2=(1-Math.cos(mu*Math.PI))/2;
for(var i=0,j=0;i<160;i++){
//index for dst buffer
j=i*4;
//the points to interpolate between
point1=floatBuffer8000[i];
point2=(i<159)?floatBuffer8000[i+1]:point1;
point3=(i<158)?floatBuffer8000[i+2]:point1;
point4=(i<157)?floatBuffer8000[i+3]:point1;
//interpolate
point2=(point1*(1-mu2)+point2*mu2);
point3=(point2*(1-mu2)+point3*mu2);
point4=(point3*(1-mu2)+point4*mu2);
//put data into buffer
buf[j]=point1;
buf[j+1]=point2;
buf[j+2]=point3;
buf[j+3]=point4;
}
//playback
var node=_audioContext.createBufferSource(0);
node.buffer=buffer;
node.connect(_audioContext.destination);
node.noteOn(_audioContext.currentTime);

Finally found a solution for this. The conversion from 16Bit to Float is wrong, it just needs to be
floatBuffer8000[i]=j/32767.0;
Also, feeding the API with a lot of small samples doesn't work well, so you need to buffer some samples and play them together.

We Keep Coding

JavaScript is the programming language of the Web.