I've created a getSpectrum method using the getByteFrequencyData method on the Web Audio API's AnalyserNode. The array of audio data returned is relative to the audio source's volume (the source being either an <audio> element or an Audio() instance), a value from 0 to 1.
Using the audio source's volume I'm trying to normalize each value received from getByteFrequencyData so that the user of getSpectrum doesn't have to worry about volume when they're visualizing the audio data.
This is the stripped-down version of getSpectrum:
var audioData = new Uint8Array(analyser.frequencyBinCount);
var spectrum = [];
analyser.getByteFrequencyData(audioData);
for (var i = 0; i < audioData.length; i++) {
var value = audioData[i];
//Where I'm trying to calculate the value independent of volume
value = ((value / audioEl.volume) / 255);
spectrum.push(value);
}
return spectrum;
The W3C spec gives the equation used to calculate the returned byte value from maxDecibels and minDecibels. With my rudimentary understanding, I've tried to invert the math so I get a normalized value, but I can't get it working exactly right. I'm having trouble accomplishing this with just a volume value from 0 to 1.
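For reference, the conversion the spec describes is roughly the following (where dbValue is the smoothed FFT magnitude in dB for a given bin):
byteValue = Math.max(0, Math.min(255, 255 * (dbValue - minDecibels) / (maxDecibels - minDecibels)))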
Any insight would be greatly appreciated! Here's a working example of the issue. Changing the volume slider will illustrate the problem.
Update 7/22/16: Thanks to @raymond-toy's answer I figured out how to convert the 0 to 1 volume value to decibels.
volumeDB = Math.abs((Math.log(volume) / Math.LN10) * 20); // |20 * log10(volume)|
After getting the dB value, I inverted the equation in the W3C spec,
value = ((audioDataValue * volumeDB) / 255) - volumeDB
Unfortunately, value somehow still ends up relative to volume. Does anyone see what I'm missing?
getByteFrequencyData returns values in dB. You don't want to divide these values by audioEl.volume. You want to convert (somehow!) audioEl.volume to a dB value and add (or subtract) that to/from the values from getByteFrequencyData.
It might be easier to understand things if you used getFloatFrequencyData first to see what's happening.
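A minimal sketch of what this suggestion looks like using getFloatFrequencyData (volume is the element's 0 to 1 volume; as the update below shows, this may still not give a truly volume-independent spectrum):
var volumeDB = (Math.log(volume) / Math.LN10) * 20; // e.g. 0.5 -> about -6 dB
var floatData = new Float32Array(analyser.frequencyBinCount);
analyser.getFloatFrequencyData(floatData);
for (var i = 0; i < floatData.length; i++) {
  floatData[i] -= volumeDB; // shift each dB magnitude by the element volume expressed in dB
}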
Apparently I was on a fool's errand. As @raymond-toy pointed out, spectrum values are inherently relative to volume. Normalizing would mean losing a portion of data "off the bottom of the spectrum", which was not my goal.
If anyone's curious, I ended up just dividing the audioDataValue by 255, getting a float from 0 to 1.
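For reference, a minimal sketch of what getSpectrum ended up doing (same names as the snippet above):
var audioData = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(audioData);
var spectrum = [];
for (var i = 0; i < audioData.length; i++) {
  spectrum.push(audioData[i] / 255); // 0 (no energy in that bin) to 1 (maximum)
}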
Related
I would like to work with percentages while doing some FFT with the web audio API.
To do so I need to know the range of the values the analyser.getByteFrequencyData returns.
I can't find anything about that, maybe someone knows?
Thanks
analyser.getByteFrequencyData returns a normalized array of values between 0 and 255.
The length of the array is half the value of analyser.fftSize (i.e. analyser.frequencyBinCount).
So if analyser.fftSize = 1024, analyser.getByteFrequencyData will fill an array with 512 values.
Also see https://stackoverflow.com/a/14789992/4303873 for more information.
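So, to work in percentages, one approach is simply (a minimal sketch; analyser is assumed to be an existing AnalyserNode):
var freqData = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(freqData);
var percentages = [];
for (var i = 0; i < freqData.length; i++) {
  percentages.push((freqData[i] / 255) * 100); // 0% to 100% per frequency bin
}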
I've set up a web page with a theremin and I'm trying to change the color of a web page element based on the frequency of the note being played. The way I'm generating sound right now looks like this:
osc1 = page.audioCX.createOscillator();
pos = getMousePos(page.canvas, ev);
osc1.frequency.value = pos.x;
gain = page.audioCX.createGain();
gain.gain.value = 60;
osc2 = page.audioCX.createOscillator();
osc2.frequency.value = 1;
osc2.connect(gain);
gain.connect(osc1.frequency);
osc1.connect(page.audioCX.destination);
What this does is oscillate the pitch of the sound created by osc1. I can change the color based on the frequency of osc1 by using osc1.frequency.value, but this doesn't factor in the modulation applied by osc2 and the gain node.
How can I get the resultant frequency from those chained elements?
You have to do the addition yourself (osc1.frequency.value + output of gain).
The best current (but see below) way to get access to the output of gain is probably to use a ScriptProcessorNode. You can just use the last sample from each buffer passed to the ScriptProcessorNode, and set the buffer size based on how frequently you want to update the color.
(Note on ScriptProcessorNode: There is a bug in Chrome and Safari that makes ScriptProcessorNode not work if it doesn't have at least one output channel. You'll probably have to create it with one input and one output, have it send all zeros to the output, and connect it to the destination, to get it to work.)
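A minimal sketch of that approach, using the names from the question (the buffer size and the probe variable are illustrative):
var probe = page.audioCX.createScriptProcessor(1024, 1, 1);
gain.connect(probe);
probe.connect(page.audioCX.destination); // works around the bug mentioned above
probe.onaudioprocess = function (e) {
  var input = e.inputBuffer.getChannelData(0);
  var modulation = input[input.length - 1];            // last sample of the gain output
  var currentFreq = osc1.frequency.value + modulation; // the resulting frequency
  // update the page element's color from currentFreq here
  var out = e.outputBuffer.getChannelData(0);
  for (var i = 0; i < out.length; i++) out[i] = 0;     // send silence to the destination
};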
Near-future answer: You can also try using an AnalyserNode, but under the current spec, the time domain data can only be read from an AnalyserNode as bytes, which means the floating point samples are being converted to be in the range [0, 255] in some unspecified way (probably scaling the range [-1, 1] to [0, 255], so the values you need would be clipped). The latest draft spec includes a getFloatTimeDomainData method, which is probably your cleanest solution. It seems to have already been implemented in Chrome, but not Firefox, as far as I can tell.
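And a sketch of the AnalyserNode variant, where getFloatTimeDomainData is supported (again, names follow the question's code):
var analyser = page.audioCX.createAnalyser();
gain.connect(analyser);
var timeData = new Float32Array(analyser.fftSize);
function currentFrequency() {
  analyser.getFloatTimeDomainData(timeData); // recent samples of the gain output
  return osc1.frequency.value + timeData[timeData.length - 1];
}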
When creating a sound with AudioBufferSourceNode I can set the offset and the duration in seconds.
I have the offset and duration in sample positions which I suppose I have to convert to time, and I don't know where to start. Is it possible to get an exact match?
It seems there were sample offset and length parameters in an earlier version of the Web Audio API, but not any more.
From the documentation (W3C):
Please note that as a low-level implementation detail, the AudioBuffer
is at a specific sample-rate (usually the same as the AudioContext
sample-rate), and that the loop times (in seconds) must be converted
to the appropriate sample-frame positions in the buffer according to
this sample-rate.
The match should be exact; just divide your sample position by the sample rate:
second_offset = sample_offset / sample_rate
and
second_duration = sample_duration / sample_rate
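For example (a minimal sketch; audioContext, buffer, sample_offset and sample_duration are assumed to exist already):
var source = audioContext.createBufferSource();
source.buffer = buffer;
var sampleRate = buffer.sampleRate;
source.start(0, sample_offset / sampleRate, sample_duration / sampleRate);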
The Web Audio API has an analyser node which allows you to get FFT data on the audio you're working with and has byte and float ways of getting the data. The byte version makes a bit of sense, returning what looks like a normalized (depending on min and max decibel values) intensity spectrum with 0 being no component of the audio at a specific frequency bin and 255 being the max.
But I'd like a bit more detail than 8 bits; using the float version, however, gives weird results.
freqData = new Float32Array(analyser.frequencyBinCount);
analyser.getFloatFrequencyData(freqData);
This gives me values between -891.048828125 and 0. -891 shows up corresponding to silence, so it's somehow the minimum value while I'm guessing 0 is equivalent to the max value.
What's going on? Why is -891.048828125 significant at all? Why is a large negative value silence and zero the maximum? Am I using the wrong typed array, or is something misconfigured? Float64Array just gives zeros.
Since there seems to be zero documentation on what the data actually represents, I looked into the relevant source code of webkit: RealtimeAnalyser.cpp
Short answer: subtract analyser.minDecibels from every value of the Float32Array to get positive numbers, then divide by (analyser.maxDecibels - analyser.minDecibels), scaling by 255 if you like, to get a similar representation as with getByteFrequencyData, just with more resolution.
Long answer:
Both getByteFrequencyData and getFloatFrequencyData give you the magnitude in decibels. It's just scaled differently and for getByteFrequencyData a minDecibels constant is subtracted:
Relevant code in webkit for getByteFrequencyData:
const double rangeScaleFactor = m_maxDecibels == m_minDecibels ? 1 : 1 / (m_maxDecibels - m_minDecibels);
float linearValue = source[i];
double dbMag = !linearValue ? minDecibels : AudioUtilities::linearToDecibels(linearValue);
// The range m_minDecibels to m_maxDecibels will be scaled to byte values from 0 to UCHAR_MAX.
double scaledValue = UCHAR_MAX * (dbMag - minDecibels) * rangeScaleFactor;
Relevant code in webkit for getFloatFrequencyData:
float linearValue = source[i];
double dbMag = !linearValue ? minDecibels : AudioUtilities::linearToDecibels(linearValue);
destination[i] = float(dbMag);
So, to get positive values, you can simply subtract minDecibels yourself, which is exposed in the analyzer node:
//The minimum power value in the scaling range for the FFT analysis data for conversion to unsigned byte values.
attribute double minDecibels;
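Putting that together (a minimal sketch; analyser is an existing AnalyserNode):
var floatData = new Float32Array(analyser.frequencyBinCount);
analyser.getFloatFrequencyData(floatData);
var range = analyser.maxDecibels - analyser.minDecibels;
var scaled = [];
for (var i = 0; i < floatData.length; i++) {
  var v = 255 * (floatData[i] - analyser.minDecibels) / range; // same scale as getByteFrequencyData
  scaled.push(Math.max(0, Math.min(255, v)));                  // clamp like the byte version does
}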
Another detail is that by default, the analyser node does time smoothing, which can be disabled by setting smoothingTimeConstant to zero.
The default values in webkit are:
const double RealtimeAnalyser::DefaultSmoothingTimeConstant = 0.8;
const double RealtimeAnalyser::DefaultMinDecibels = -100;
const double RealtimeAnalyser::DefaultMaxDecibels = -30;
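For example, the smoothing mentioned above can be disabled before reading the data:
analyser.smoothingTimeConstant = 0;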
Sadly, even though the analyser node computes a complex fft, it doesn't give access to the complex representations, just the magnitudes of it.
Correct on both points in the previous answer and comments: the numbers are in decibels, so 0 is the max and -infinity is the min (absolute silence). -891.0... is, I believe, just a floating-point conversion oddity.
You are correct in using a Float32Array. I found an interesting tutorial on using the Audio Data API, which, while different from the Web Audio API, gave me some useful insight about what you are trying to do here. I had a quick peek to see why the numbers are negative and didn't notice anything obvious, but I wondered if these numbers might be in decibels (dB), which are commonly given as negative numbers with zero at the peak. The only problem with that theory is that -891 seems like a really small number for dB.
Is there a way to render a visualization of an audio file?
Maybe with SoundManager2 / Canvas / HTML5 Audio?
Do you know of some techniques?
I want to create something like this:
You have a ton of samples and tutorials here: http://www.html5rocks.com/en/tutorials/#webaudio
For the moment it works in the latest Chrome and the latest Firefox (Opera?).
Demos: http://www.chromeexperiments.com/tag/audio/
To do it now, for all visitors of a web site, you can check SoundManagerV2.js, which goes through a Flash "proxy" to access audio data: http://www.schillmania.com/projects/soundmanager2/demo/api/ (they are already working on the HTML5 audio engine, to be released as soon as the major browsers implement it).
It's up to you to draw the 3 different kinds of audio data in a canvas: waveform, EQ and peak.
soundManager.defaultOptions.whileplaying = function() { // audio analyzer
  $document.trigger({ // dispatch all data relative to the audio stream
    type: 'musicLoader:whileplaying',
    sound: {
      position: this.position, // in milliseconds
      duration: this.duration,
      waveformDataLeft: this.waveformData.left,   // array of 256 floating-point (three decimal place) values from -1 to 1
      waveformDataRight: this.waveformData.right,
      eqDataLeft: this.eqData.left,               // two arrays of 256 floating-point (three decimal place) values from 0 to 1,
      eqDataRight: this.eqData.right,             // the result of an FFT on the waveform data; can be used to draw a spectrum (frequency range)
      peakDataLeft: this.peakData.left,           // floating-point values from 0 to 1, indicating the "peak" (volume) level
      peakDataRight: this.peakData.right
    }
  });
};
With HTML5 (the Web Audio API) you can get:
var freqByteData = new Uint8Array(analyser.frequencyBinCount);
var timeByteData = new Uint8Array(analyser.frequencyBinCount);
function onaudioprocess() {
  analyser.getByteFrequencyData(freqByteData);
  analyser.getByteTimeDomainData(timeByteData);
  /* draw your canvas */
}
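As an illustration of the drawing step (a minimal sketch; canvas and canvasCtx are an assumed <canvas> element and its 2D context, and analyser/freqByteData come from the snippet above):
function drawSpectrum() {
  analyser.getByteFrequencyData(freqByteData);
  canvasCtx.clearRect(0, 0, canvas.width, canvas.height);
  var barWidth = canvas.width / freqByteData.length;
  for (var i = 0; i < freqByteData.length; i++) {
    var barHeight = (freqByteData[i] / 255) * canvas.height; // map 0..255 to the canvas height
    canvasCtx.fillRect(i * barWidth, canvas.height - barHeight, barWidth, barHeight);
  }
  requestAnimationFrame(drawSpectrum);
}
requestAnimationFrame(drawSpectrum);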
Time to work! ;)
Run samples through an FFT, and then display the energy within a given range of frequencies as the height of the graph at a given point. You'll normally want the frequency ranges going from around 20 Hz at the left to roughly the sampling rate/2 at the right (or 20 KHz if the sampling rate exceeds 40 KHz).
I'm not so sure about doing this in JavaScript though. Don't get me wrong: JavaScript is perfectly capable of implementing an FFT -- but I'm not at all sure about doing it in real time. OTOH, for user viewing, you can get by with around 5-10 updates per second, which is likely to be a considerably easier target to reach. For example, 20 ms of samples updated every 200 ms might be halfway reasonable to hope for, though I certainly can't guarantee that you'll be able to keep up with that.
http://ajaxian.com/archives/amazing-audio-sampling-in-javascript-with-firefox
Check out the source code to see how they're visualizing the audio.
This isn't possible yet except by fetching the audio as binary data and unpacking the MP3 (not JavaScript's forte), or maybe by using Java or Flash to extract the bits of information you need (it seems possible but it also seems like more headache than I personally would want to take on).
But you might be interested in Dave Humphrey's audio experiments, which include some cool visualization stuff. He's doing this by making modifications to the browser source code and recompiling it, so this is obviously not a realistic solution for you. But those experiments could lead to new features being added to the <audio> element in the future.
For this you would need to do a Fourier transform (look for FFT), which will be slow in JavaScript, and not possible in real time at present.
If you really want to do this in the browser, I would suggest doing it in Java/Silverlight, since they deliver the fastest number-crunching speed in the browser.