I'm a beginner with Web Audio and wanted to know whether compressed (AAC, MP3, OGG) sound effects, and especially music tracks, are expanded in memory into many megabytes, like textures in WebGL?
For example, if I have a music file whose size on disk is around ~3 MB (in any of the formats: MP3, OGG, AAC), will it be expanded (decompressed) into roughly 70 MB for playback, like the original PCM data would take?
Also, is it possible to estimate how much memory an audio file uses in Web Audio with regular playback, without additional sound effects or Web Audio's more advanced nodes?
Currently, to use a compressed audio file you must load it into memory and then use decodeAudioData to convert it to an AudioBuffer, which internally consists of float arrays, essentially PCM. (Firefox, however, has an optimization where, in many cases, it can use arrays of 16-bit integers instead of floats.)
From an AudioBuffer's number of channels, sample rate, and duration (or its length in sample frames), you can get a pretty good estimate of the memory used.
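For example, a rough estimate as a minimal sketch; the helper name and file name are placeholders, and it assumes 4 bytes per sample (32-bit floats) as described above:

```js
// Rough memory estimate for a decoded AudioBuffer.
// Assumes samples are stored as 32-bit floats (4 bytes each);
// Firefox may use 16-bit integers in some cases, roughly halving this.
function estimateAudioBufferBytes(buffer) {
  return buffer.length * buffer.numberOfChannels * 4; // frames * channels * bytes per sample
}

const ctx = new AudioContext();
fetch('music.mp3')                                    // placeholder file name
  .then(response => response.arrayBuffer())
  .then(data => ctx.decodeAudioData(data))
  .then(buffer => {
    const mb = estimateAudioBufferBytes(buffer) / (1024 * 1024);
    console.log(`~${mb.toFixed(1)} MB for ${buffer.duration.toFixed(1)} s of audio`);
  });
```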
If this is not suitable for your use case, the only alternative is to use a MediaElementAudioSourceNode and friends to stream the compressed file to the browser. You lose sample-accurate control of the source, however.
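A minimal sketch of that alternative (the file name is a placeholder):

```js
// Stream a compressed file through the Web Audio graph instead of
// decoding it all into an AudioBuffer up front.
const ctx = new AudioContext();
const audio = new Audio('music.mp3');               // placeholder file name
const source = ctx.createMediaElementSource(audio);
source.connect(ctx.destination);
audio.play();                                       // may require a user gesture
```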
Related
So here is my problem. I want to play audio from Node.js running on a Raspberry Pi and then adjust the brightness of an LED strip connected to the same Pi based on frequency readings from the audio file. However, I can't seem to find anything in Node that provides the same functionality as the Web Audio API's AnalyserNode.
I found a few libraries (e.g. https://www.npmjs.com/package/audio-render) that come close and are based on the Web Audio API, but the frequency values they produce are completely incorrect. I verified this by comparing them against a browser version I created using the Web Audio API.
I need the audio to play from node while also being analyzed to affect the brightness levels.
Any help would be appreciated. I really thought this would be simpler to handle in Node, but six hours later I'm still without a solution.
Victor Dibiya at IBM has an excellent example that illustrates how to use the web-audio-api module to decode an audio file into a buffer of PCM data, from which one can extract amplitude data and infer beats:
https://github.com/victordibia/beats
I have this working on a Raspberry Pi with LEDs controlled via Fadecandy.
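Roughly, the decoding step looks like the sketch below. It assumes the web-audio-api npm module exposes a Node AudioContext with a browser-style decodeAudioData, as its README describes; the file name, window size, and brightness scaling are made-up placeholders, so check Victor Dibiya's repo for the real beat-detection logic.

```js
// Sketch: decode a file to PCM with the web-audio-api module and derive a
// crude per-window amplitude that could drive LED brightness.
const fs = require('fs');
const { AudioContext } = require('web-audio-api');

const context = new AudioContext();
fs.readFile('track.mp3', (err, compressed) => {       // placeholder file name
  if (err) throw err;
  context.decodeAudioData(compressed, audioBuffer => {
    const samples = audioBuffer.getChannelData(0);    // PCM floats for channel 0
    const windowSize = 1024;                          // arbitrary analysis window
    for (let i = 0; i < samples.length; i += windowSize) {
      let sum = 0;
      const end = Math.min(i + windowSize, samples.length);
      for (let j = i; j < end; j++) sum += Math.abs(samples[j]);
      const brightness = Math.min(1, (sum / (end - i)) * 4); // crude scaling
      console.log(brightness);
    }
  });
});
```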
Complete question: Why is it more suitable to use a MediaElementAudioSourceNode rather than an AudioBuffer for longer sounds?
From MDN:
Objects of these types are designed to hold small audio snippets, typically less than 45 s. For longer sounds, objects implementing the MediaElementAudioSourceNode are more suitable.
From the specification:
This interface represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved 32-bit linear floating-point PCM values with a normal range of [-1, 1], but values are not limited to this range. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the audio element and MediaElementAudioSourceNode.
What are the benefits of using a MediaElementAudioSourceNode over an AudioBuffer?
Are there any disadvantages when using a MediaElementAudioSourceNode for short clips?
A MediaElementAudioSourceNode has the potential ability to stream, and certainly to start playing before the entire sound file has been downloaded and decoded. It can also do this without converting (and likely expanding!) the sound file to 32-bit linear PCM (CD-quality audio is only 16 bits per channel) and transcoding it to the output device's sample rate. For example, a 1-minute podcast recorded at 16-bit, 16 kHz would be just under 2 megabytes in its native size; if you're playing back on a 48 kHz device (not uncommon), the conversion to 32-bit, 48 kHz PCM means you're using up nearly 12 megabytes as an AudioBuffer.
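The arithmetic behind those numbers, as a quick sketch (assuming mono audio and 4 bytes per sample after decoding):

```js
// 1-minute mono podcast, 16-bit samples at 16 kHz, as delivered:
const nativeBytes = 60 * 16000 * 2;   // 1,920,000 bytes, just under 2 MB

// The same minute decoded to 32-bit float and resampled to a 48 kHz device:
const decodedBytes = 60 * 48000 * 4;  // 11,520,000 bytes, nearly 12 MB

console.log(nativeBytes, decodedBytes);
```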
A MediaElementAudioSourceNode won't give you precise playback timing, though, or the ability to manage and play back lots of simultaneous sounds. The precision may be reasonable for your use case, but it won't be the sample-accurate timing an AudioBuffer can have.
I've discovered that (at least in Chrome) Web Audio resamples WAV files to 48 kHz when using decodeAudioData. Is there any way to prevent this and force it to use the file's original sample rate? I'm sure this is fine for game development, but I'm trying to write some audio editing tools, and this sort of thing isn't cool. I'd like to be fully in control of when/if resampling occurs.
As far as I know, you're just going to get whatever sampling rate your AudioContext is using (which will be determined by your sound card, I believe).
They lay out the steps here: https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#dfn-decodeAudioData
"The decoding thread will take the result, representing the decoded linear PCM audio data, and resample it to the sample-rate of the AudioContext if it is different from the sample-rate of audioData. The final result (after possibly sample-rate converting) will be stored in an AudioBuffer."
Nope, you can't prevent decodeAudioData from resampling to the AudioContext's sampleRate. Either load and create AudioBuffers yourself, or decode the file into a buffer in an OfflineAudioContext that is fixed to the file's original rate (although it's going to be hard to tell what that rate is, I imagine).
There is discussion on this point - https://github.com/WebAudio/web-audio-api/issues/30.
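If you do know the original rate (for example by reading the WAV header yourself), a sketch of that workaround looks like this; the 44100 value and file name are assumptions:

```js
// Decode without changing the rate by giving the OfflineAudioContext
// the file's own sample rate (44100 here is assumed, not detected).
const originalRate = 44100;
const offline = new OfflineAudioContext(2, originalRate, originalRate); // length is arbitrary for decoding

fetch('clip.wav')                                  // placeholder file name
  .then(response => response.arrayBuffer())
  .then(data => offline.decodeAudioData(data))
  .then(buffer => {
    console.log(buffer.sampleRate);                // 44100, not the hardware rate
  });
```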
There is now a web component for loading audio using SoX:
https://www.npmjs.com/package/sox-element
It allows you to decode audio at the original sample rate. The data is unaltered from the original.
I am getting 16-bit audio from the server (that is what it currently sends).
It is interleaved.
That means I need to loop over it and separate the right and left channels into two 32-bit float arrays in JavaScript.
Well, this is just too slow for JavaScript to execute while also scheduling the play times; things get out of sync. This is a live stream, so the Web Audio API seems to be designed only for local synths and the like, and streaming PCM does not seem to be a good approach. I know you would never send PCM to begin with. Say I want to send Vorbis or something similar instead: it has to be in a container like .ogg or WebM, but browsers have their own internal buffering and we have very little or no control over it.
So I tried sending ADPCM and decoding it to PCM on the client in JavaScript. That is also too slow.
If I preprocess the data on the server instead (de-interleave it there and convert it to 32-bit floats before sending it to the client), the data size doubles, going from 16-bit to 32-bit samples.
So what is the best way to render 16-bit audio without processing on the client side?
Also, can you play audio from worker threads? Would implementing the conversions in a worker thread help? I mean, there is all that WebSocket communication going on, and JS is single-threaded.
I would also like to add that doing the computation in Chrome on a Mac Pro works much better (I hear almost no glitches between samples) than on a client running on a PC.
No, you cannot currently play audio from Worker threads. However, I really doubt that your problem is in the cost of de-interleaving audio data; have you tried just sending a mono stream? Properly synchronizing and buffering a stream in a real network environment is quite complex.
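For reference, the de-interleave itself is just a tight loop over typed arrays and is rarely the bottleneck; a minimal sketch, assuming 16-bit signed stereo PCM arriving as an ArrayBuffer in the platform's byte order:

```js
// Split interleaved 16-bit stereo PCM into two Float32Arrays in [-1, 1].
function deinterleave16(arrayBuffer) {
  const input = new Int16Array(arrayBuffer);
  const frames = input.length / 2;              // two samples (L, R) per frame
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = input[2 * i] / 32768;
    right[i] = input[2 * i + 1] / 32768;
  }
  return [left, right];
}
```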
The Web Audio API seems cool, but I'd really like to use it to process audio files and then save them as WAV files again, and I don't really need to listen to them while they are processing. Is this possible? Is there something like encodeAudioData() to turn the AudioBuffer back into an ArrayBuffer so I could put it back in a file?
Edit: recorderJS seems almost perfect, but it only outputs 16-bit WAVs. Any chance there is something that can do pro-audio formats (24-bit or 32-bit float)?
The Web Audio API specification includes the OfflineAudioContext, which does exactly what you need.
OfflineAudioContext is a particular type of AudioContext for rendering/mixing-down (potentially) faster than real-time. It does not render to the audio hardware, but instead renders as quickly as possible, calling a completion event handler with the result provided as an AudioBuffer.
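A minimal sketch of rendering offline and getting the result back as an AudioBuffer (the file name and 40-second length are assumptions, and you still need a WAV encoder, such as the one in recorderJS or your own, to write the rendered buffer to a file):

```js
// Render 40 seconds of stereo audio at 44.1 kHz faster than real time.
const offline = new OfflineAudioContext(2, 44100 * 40, 44100);

fetch('input.wav')                                 // placeholder file name
  .then(response => response.arrayBuffer())
  .then(data => offline.decodeAudioData(data))
  .then(buffer => {
    const source = offline.createBufferSource();
    source.buffer = buffer;
    source.connect(offline.destination);           // insert processing nodes here
    source.start(0);
    return offline.startRendering();               // resolves with an AudioBuffer
  })
  .then(rendered => {
    // Encode rendered.getChannelData(n) to WAV (16/24-bit or 32-bit float) yourself.
    console.log(rendered.length, rendered.numberOfChannels);
  });
```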