Javascript library for recording user audio? - javascript

This isn't another one of those "How can I record audio in the browser?" questions... I know that the HTML5 Stream API is around the corner and Flash can already access the user's microphone and camera. I'm simply wondering, as a JavaScript developer with little knowledge of Flash, whether anyone has developed a JS library that hooks into Flash's device capabilities for recording but sends the results back to JavaScript (presumably using ExternalInterface).
In other words... libraries like SoundManager2 utilize a Flash fallback for audio playback, but they don't seem to allow for recording. Has anyone written a JS library that uses an invisible Flash movie to allow audio recording?

This does most of what you're looking for:
https://code.google.com/p/wami-recorder/
It records audio and sends it to a server via an HTTP POST (avoiding the need for a Flash Media Server). A JavaScript API is available via ExternalInterface.
I'm not sure why you'd want the audio bytes in JavaScript, but it would probably be easy to modify it to do that too.
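For reference, driving WAMI from JavaScript looks roughly like this. This is a hedged sketch from memory of the project's example code, so verify the method names against the docs; the recording URL is hypothetical:

// Hedged sketch of the WAMI JavaScript API; the /record endpoint is hypothetical.
Wami.setup({ id: "wami" });                         // embeds the Flash movie in <div id="wami">
Wami.startRecording("https://example.com/record");  // WAMI POSTs the recorded audio here
// ... later, once the user is done speaking ...
Wami.stopRecording();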

Unfortunately, you can't really do Flash audio recording in the browser alone. The Flash audio interfaces are all designed (surprise, surprise) to talk to a Flash Media Server (or Red5): there is no interface to store recorded audio data locally or to pass it to JavaScript.
Once you have Red5/FMS set up, you can control the recording process from JavaScript: you can start/stop/play back the audio stream to/from the server. However, for security reasons the Flash movie has to be at least 216 x 138 pixels (see http://blog.natebeck.net/2009/01/tip-of-the-day-tricks-of-the-mic-settings-panel/ for a writeup), otherwise the settings manager won't be shown; this prevents people from hiding an audio-recording Flash widget on a page and eavesdropping.
So no: no invisible Flash movie controlled from JavaScript.

Related

Web Audio Api integration with Web Speech Api - stream speaker/soundcard output to voice recognition api

Problem:
Ideally I would acquire the streaming output from the soundcard (generated by an mp4 file being played) and send it to both the microphone and the speakers. I know I can use getUserMedia and createChannelSplitter (in the Web Audio API) to acquire and split the user media into two outputs (based on Audacity analysis, the original signal is stereo), which leaves me with two problems:
getUserMedia can only get streaming input from the microphone, not from the soundcard (from what I have read)
streaming output can only be recorded/sent to a buffer, not sent to the microphone directly (from what I have read)
Is this correct?
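For concreteness, the acquire-and-split step described above might look like this (a hedged sketch using the modern promise-based getUserMedia, error handling omitted; as noted, it can only capture the microphone, not the soundcard):

navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
  var ctx = new AudioContext();
  var source = ctx.createMediaStreamSource(stream);   // microphone only
  var splitter = ctx.createChannelSplitter(2);        // split the stereo signal
  source.connect(splitter);
  splitter.connect(ctx.destination, 0);               // channel 1 -> user's ears
  // channel 2 has nowhere useful to go: it can be recorded into a buffer,
  // but not fed to the microphone input or the Web Speech API directly.
});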
Possible workaround - stalled:
The user will most likely have a headset microphone on, but one workaround I have thought of is to switch to the device's inbuilt microphone to capture what comes out of the speakers, then switch back to the headset for user input. However, I haven't found a way to switch between the inbuilt microphone and the headset microphone without prompting the user every time.
Is there a way to do this that I haven't found?
What other solutions would you suggest?
Project Explanation:
I am creating a Spanish language practice program/website written in HTML and JavaScript. An mp4 will play, and the speech recognition API will display what is said on screen (as it is spoken, in Spanish); it will also be translated into English, so the user hears, sees, and understands what the person in the mp4 is saying. The user will then answer the mp4 person through the headset microphone (the inbuilt microphone often isn't good enough for voice recognition, depending on the device, hence the headset).
flow chart of my workaround using the inbuilt microphone:
mp4 -> soundcard -> Web Audio API -> channel 1 -> user's ears
channel 2 -> microphone input -> Web Speech API -> html -> text onscreen
flow chart of the ideal situation, skipping microphone input:
mp4 -> soundcard -> Web Audio API -> channel 1 -> user's ears
channel 2 -> Web Speech API -> html -> text onscreen -> user's eyes
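For the "channel 2 -> Web Speech API" step, the recognition side would look roughly like this (a hedged sketch; webkitSpeechRecognition only listens to the microphone, which is exactly the limitation above, and the caption element is hypothetical):

var recognition = new webkitSpeechRecognition();  // prefixed in Chrome
recognition.lang = "es-ES";                       // recognize Spanish
recognition.continuous = true;                    // keep listening across pauses
recognition.interimResults = true;                // emit partial results as they arrive
recognition.onresult = function (event) {
  var result = event.results[event.results.length - 1];
  document.getElementById("caption").textContent = result[0].transcript;
};
recognition.start();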
Another potential work around:
I would like to avoid having to manually strip an mp3 from each mp4 and then try to sync them so the voice recognition happens as the person in the mp4 speaks. I have read that I can run an mp3 through the voice recognition API.
The short answer is that there is not currently (12/19) a way to accomplish this on this platform with the tools and budget I have. I have opted for the laborious approach: setting up individual divs with text blocks that are revealed on a timer as the person speaks. I will still use the Speech API to capture what the user says so the program can play the correct video in response.
Switching between the speaker and the user's headset is a definite no-go.
Speech recognition software usually requires clean, well-captured audio. So if the sound is coming from speakers, the user's microphone is unlikely to pick it up very well. And if the user is wearing headphones, there is no way for the microphone to capture the audio at all.
As far as I know, you cannot send audio files to the Web Speech API directly (I may be wrong here).
The Web Speech API is not supported by all browsers, so that is another downside to consider: https://caniuse.com/#feat=speech-recognition
What I would recommend is checking out Google's Speech to text API: https://cloud.google.com/speech-to-text/
With this service you can send the audio file directly, and they will send back the transcription.
It does support streaming so you could have the audio transcribed at the same time it is playing. The timing wouldn't be perfect though.
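A hedged sketch of that server-side call with Google's Node client (@google-cloud/speech, v1 API); the encoding and sample rate here are placeholders that must match the actual audio file:

const speech = require("@google-cloud/speech");
const fs = require("fs");

async function transcribe(path) {
  const client = new speech.SpeechClient();
  const [response] = await client.recognize({
    config: {
      encoding: "LINEAR16",      // adjust to the file's real encoding
      sampleRateHertz: 16000,    // adjust to the file's real sample rate
      languageCode: "es-ES",     // Spanish, per the project above
    },
    audio: { content: fs.readFileSync(path).toString("base64") },
  });
  return response.results.map(r => r.alternatives[0].transcript).join("\n");
}

For the streaming case mentioned above, the client also exposes a streamingRecognize call that accepts audio chunks as they play.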

How do in-browser audio players work?

I've been doing JavaScript programming for some time, but it's always been related to updating, saving, and manipulating data.
I have no idea how something like an in-browser audio player gets audio (especially live, streaming audio) from the internet and plays it out of my computer speakers.
How does this happen in Javascript?
For example, how does a website deliver live audio to my speakers using Javascript? http://player.streamtheworld.com/liveplayer.php?callsign=WVIEAM
The live audio is not much different from pre-recorded audio... it's just played back as it's received, and when live it's encoded as it's recorded.
In browsers these days, the most basic form of streaming audio is a simple <audio> tag. By changing the src attribute from a file to a stream, you're up and running:
<audio src="http://cdn.audiopump.co/waug/main_mp3_256k" controls></audio>
The browser doesn't know or care in this case that the audio is a live stream. All it knows is that there's some media data that it's fetching via HTTP, and playing back while it comes in.
If your browser-compatibility requirements allow it, it is preferable to use the MediaSource API, which gives you more control (such as switching to a different quality stream mid-stream, as in HLS) and ensures that the browser doesn't try to cache what is effectively an infinitely sized file.
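A rough sketch of the MediaSource approach (hedged: whether a given browser accepts "audio/mpeg" in addSourceBuffer varies, and real code would need error handling):

var audio = document.querySelector("audio");
var mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", function () {
  var sourceBuffer = mediaSource.addSourceBuffer("audio/mpeg");
  fetch("http://cdn.audiopump.co/waug/main_mp3_256k").then(function (response) {
    var reader = response.body.getReader();
    function pump() {
      reader.read().then(function (result) {
        if (result.done || mediaSource.readyState !== "open") return;
        // wait for the previous append to finish before reading more
        sourceBuffer.addEventListener("updateend", pump, { once: true });
        sourceBuffer.appendBuffer(result.value);
      });
    }
    pump();
  });
});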
For example, how does a website deliver live audio to my speakers using Javascript? http://player.streamtheworld.com/liveplayer.php?callsign=WVIEAM
This particular site is run by Triton Digital, and they still use Flash. Many sites still do this as a holdover from a time when HTML5 audio was not widely supported. There is little reason to do this today.
Other reasons to use Flash include incompatible server protocols: if your streaming server uses RTMP, you're stuck with Flash, as browsers don't speak RTMP.
There used to be an issue with streaming AAC in-browser due to browsers not properly handling AAC wrapped in ADTS. (This encapsulation is required for streaming AAC in most situations.) Most browsers have resolved this, but I suspect that this is the reason Triton Digital is still using their Flash solution. By using Flash, they can play AAC/ADTS streams.

Live Streaming to Browser

I'm a bit at my wits end here.
I want to stream a live video broadcast to a web browser.
Currently I use ffmpeg to stream a DirectShow live source as a WebM stream to node.js, which then forwards the stream to the HTTP request from the <video> element. So far everything works.
live source -> ffmpeg -> POST [webm] -> node.js -> GET [webm] -> video tag
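That relay can be surprisingly small; here is a minimal sketch of it (hypothetical: single stream, no error handling, and note that a client joining mid-stream would miss the WebM initialization segment, which real code has to cache and replay):

// Minimal sketch of the node.js relay described above: ffmpeg POSTs the
// WebM stream in, each <video> element GETs it back out.
const http = require("http");
const clients = [];

http.createServer((req, res) => {
  if (req.method === "POST") {            // ffmpeg pushes the WebM stream here
    req.on("data", chunk => clients.forEach(c => c.write(chunk)));
    req.on("end", () => clients.forEach(c => c.end()));
  } else {                                // each <video> element pulls from here
    res.writeHead(200, { "Content-Type": "video/webm" });
    clients.push(res);
  }
}).listen(8080);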
My problem is that the source clock and the web client's clock don't exactly match each other (not that surprising). For video this is not a problem: dropping or duplicating a frame every now and then is not noticeable. Audio, however, is another issue. From what I've been able to figure out so far, Chrome (like every other browser) does not perform any form of audio resampling compensation (e.g. swr_set_compensation from ffmpeg) for this mismatch. Instead I get quite audible distortions (a loud beep) when the playback buffer runs out of samples.
My question is whether it is possible to achieve proper playback (with audio) of a live source in a web browser?
I haven't tried using silverlight or flash for playback yet. Would that possibly work better?
Live media (audio and/or video) streaming to a web browser has been possible for a couple of years, though it is still making progress as of today. It is the next big thing for media on the web, and many platforms like YouTube are already on board.
A typical live media streaming scenario is:
audio/video feed > transcoding > streaming > player
At each step you have several technological possibilities available. However I should already mention here that the road to live media streaming is paved with proprietary technologies.
audio/video feed: the feed is raw or very lightly compressed and cannot be uploaded to the Internet as such; you need to transcode it. You may also need a grabbing device like a PCI Express card or a USB/Thunderbolt device to get your camera onto a computer.
transcoding: you have software (ffmpeg, Flash Media Live Encoder, Wirecast) or hardware solutions (streamingmedia.com has a wealth of information on the subject, like here). H.264/AAC is the current professional media standard, and streams are often transcoded to multiple renditions (bitrates) to suit different network conditions.
streaming: you most likely need to target multiple devices to deliver your live stream, and not all devices support the same streaming protocol. HLS works on Apple devices and Android > 4.1; HDS and RTMP work in Flash, Smooth Streaming in Silverlight. You cannot reach all devices with one protocol, so in this case you would need a streaming server like Wowza or Red5. A streaming server takes a transcoded live stream as input and prepares it for cross-device delivery while sustaining a massive number of simultaneous connections (over a thousand is not uncommon nowadays). It can also add functionality like DVR or DRM. As of today the effort is centered on HTTP adaptive-bitrate delivery; large companies add CDN support for global delivery.
player: to display your live stream with various options like custom layout, closed captions, ads, chat module and more. Flash has been leading the market up until now for live media streaming on desktop. You can use HTML5 video for iOS and Android where HLS is supported.
Coming in fast is MPEG DASH and it works live with HTML5 video. There is a JS lib that supports live. I have tested it and it works though I may not use it for a production case scenario just yet as it is still a bit clunky (on demand support is better) and browser support is narrow at the moment (As of 8/30/13, Desktop Chrome, Desktop Internet Explorer 11, and Mobile Chrome Beta for Android are the only browsers supported).
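If the lib in question is dash.js (an assumption on my part), wiring it up against its current API is short; the manifest URL here is hypothetical:

var player = dashjs.MediaPlayer().create();
player.initialize(document.querySelector("video"),
                  "https://example.com/live/stream.mpd",  // live DASH manifest
                  true);                                  // autoplay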
I cannot comment much on your solution because I have not used node.js for streaming but it sounds like an interesting effort. A typical solution I would use relating to your case:
Device > ffmpeg (H264/AAC) > Wowza > Hybrid player (Flash + HTML5).
Instead of Wowza you could use Red5 (free/open source - but not much activity as of late). You can also look into Nginx RTMP module which supports HLS and MPEG DASH on top of RTMP.
For Flash I use Strobe from Adobe, which supports live streaming and is easy to set up, with a fallback to HTML5 where Flash is not supported. I use the SWFObject lib to detect Flash support and feed an HLS URL to an HTML5 video tag for mobile devices. You can use RTSP for Android < 4.1 and other mobile devices.
Another thing I should mention is real time communications. For video/audio conferencing you could have a look at WebRTC. Those 2 articles should get you on the right track. Here and here. WebRTC will work great for one to few, one to one, few to few. If you need to support more concurrent connections you can have a look at Licode or tokbox.

HTML5 base64 encoded audio on mobile devices

I'm writing a platform with an audio playback component. Audio is uploaded to the server as a wav/mp3/ogg file and then (like the rest of our media) converted to base64 and stored in our Redis database.
To play the audio back on the client side, we make an AJAX request to the server for the base64-encoded audio. We have a desktop version that complements the mobile application; at the moment audio playback works like this:
recording.sound = new Audio("data:audio/ogg;base64," + recording.audio);
recording.sound.play(); // this works
Today we started our tests on mobile devices, and have so far been unable to get it working, even on mobile browsers that apparently support HTML5 audio.
Any ideas? Or if this is not possible, is there a different approach we can take? Ideally there should be a mobile compatible version of the web app, and there has to be a phonegap version.
The reason might not be a technical one here; from Apple's developer site:
In Safari on iOS (for all devices, including iPad), where the user may be on a cellular network and be charged per data unit, preload and autoplay are disabled. No data is loaded until the user initiates it. This means the JavaScript play() and load() methods are also inactive until the user initiates playback, unless the play() or load() method is triggered by user action. In other words, a user-initiated Play button works, but an onLoad="play()" event does not.
The same applies to Android devices.
read more here: Safari HTML5 Audio and Video Guide
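In practice that means wiring play() to a real user gesture instead of calling it on load, e.g. (reusing the question's own recording object; the button element is hypothetical):

document.getElementById("playButton").onclick = function () {
  // allowed: play() is triggered by a user action
  recording.sound = new Audio("data:audio/ogg;base64," + recording.audio);
  recording.sound.play();
};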
But "audio/wav" is not a registered media type. See the IANA registry: http://www.iana.org/assignments/media-types/audio
The registered type for a .wav file is "audio/vnd.wave"; use "audio/mpeg" for an .mp3 file and "audio/ogg" for an .ogg file...
OK, try a Stack Overflow search, see:
https://stackoverflow.com/search?q=audio+codec+support+mobile+devices+html5
or https://stackoverflow.com/search?q=audio+codec+support+mobile+devices+html
or try Google.
Some search results that might be useful:
In search for a library that knows to detect support for specific Audio Video format files
or html5 vs flash - full comparison chart anywhere?

Can I use microphone and sound with javascript?

I would like to develop a very small application using JavaScript... this application should pass the voice recorded from a microphone through to the sound output.
Is it possible?
I know that I can access the microphone using Flash, but I would like to use JavaScript if possible.
Thank you!
Keep an eye on HTML5's implementation of getUserMedia. For a workaround using Flash, see:
https://code.google.com/p/wami-recorder/
That example actually passes audio to a server via an HTTP POST (so no need for a Flash Media Server), but you could easily adapt it to keep the audio on the client side.
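With getUserMedia now available, the mic-to-speakers loop the question asks for fits in a few lines (a hedged sketch using the modern promise-based API, which postdates this answer):

navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
  var ctx = new AudioContext();
  // mic -> speakers; expect feedback unless the user wears headphones
  ctx.createMediaStreamSource(stream).connect(ctx.destination);
});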
This question about video streaming via web sockets shows that streaming video is possible. Theoretically it might be possible to write a client-side application that creates a local TCP socket for microphone and audio, to which the browser and JavaScript then listen.
I don't know if this has ever been attempted, and it would require significant code outside the browser to make it happen.
You don't gain much by doing it this way over, say, Flash, since you still have client-side dependencies.
Nope, this is not possible. JavaScript is not meant to access devices. You will need some abstraction technology like Flash or Silverlight to help you with this; the JavaScript engine runs inside the browser and has no direct access to the client machine on which the browser is running.
