I am currently playing around with audio visualization and I am trying to work with Spotify's Web Playback SDK to stream and analyze songs directly on my site.
However, I am unsure what the limitations are when it comes to actually reading the streamed data. I've noticed that an iframe is generated for the Spotify player, and I've read that Spotify uses Encrypted Media Extensions (EME) to stream the audio in Chrome.
Is it even possible to read the music data from the Spotify API? Or could I read the output audio from the browser instead?
According to the Web API documentation, you aren't able to play back full songs and get the audio data like you desire (for obvious reasons). However, 30-second song previews are available through URL streaming, and full song playback is possible in desktop browsers (excluding Safari at the time of this post) with the Web Playback SDK.
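Since a preview is just an ordinary MP3 URL (the Web API exposes it as preview_url on track objects), you can feed it through an AnalyserNode for visualization. A rough sketch, assuming the preview host sends CORS headers (without them the analyser reads silence); the URL below is a placeholder:

    // Visualize a 30-second preview with the Web Audio API
    const audio = new Audio('https://p.scdn.co/mp3-preview/PLACEHOLDER');
    audio.crossOrigin = 'anonymous';
    const ctx = new AudioContext();
    const source = ctx.createMediaElementSource(audio);
    const analyser = ctx.createAnalyser();
    source.connect(analyser);
    analyser.connect(ctx.destination); // keep it audible
    const bins = new Uint8Array(analyser.frequencyBinCount);
    function draw() {
      requestAnimationFrame(draw);
      analyser.getByteFrequencyData(bins); // render `bins` however you like
    }
    audio.play();
    draw();

However, on the mobile SDKs (Android and iOS) it is now possible to get the raw PCM data. This requires registering for a developer account and setting up access tokens, if you haven't already done so.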
For quick reference, on Android it involves using the AudioController class.
EDIT: Thanks to @Leme for the Web Playback SDK link.
Related
I want to use the Spotify API to build a simple audio player embedded in a Chrome extension. I've integrated the auth flow and I'm able to get a token to use for API requests. I'm looking at the documentation for a way to play the user's playlists or searched tracks, but I can't find any useful information. They only have this SDK, and a Premium account is needed.
Is it possible to use the Spotify URI to play a full track or playlist after authentication?
No, there's currently no way to play the full song using the Web API. If you want full tracks to be playable from a website, you can use the Spotify Play Button. If you want to build a mobile application, you can do playback of full tracks using the Android and/or iOS SDK.
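For reference, the Play Button is an iframe embed. A sketch of dropping one in from JavaScript (the embed URL shape matches the current docs and the track ID is the example one from Spotify's documentation; older pages used embed.spotify.com, so verify against the docs):

    // Embed the Spotify Play Button for a single track
    const iframe = document.createElement('iframe');
    iframe.src = 'https://open.spotify.com/embed/track/4iV5W9uYEdYUVa79Axb7Rh';
    iframe.width = 300;
    iframe.height = 80;
    iframe.allow = 'encrypted-media';
    document.body.appendChild(iframe);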
I'm doing it right now using the Spotify Web API and this:
https://github.com/JMPerez/spotify-web-api-js
I'm running this locally to provide refresh tokens:
https://github.com/spotify/web-api-auth-examples
I'm still figuring it out, but I definitely have things playing in my browser locally.
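For anyone trying the same route: once you have a token with the right scope (user-modify-playback-state), starting playback is one call to the player endpoint. A sketch only; it requires a Premium account and an active device, and the track URI is the example one from Spotify's docs:

    // Start playback on the user's active device via the Web API
    async function playTrack(accessToken, trackUri) {
      const res = await fetch('https://api.spotify.com/v1/me/player/play', {
        method: 'PUT',
        headers: {
          'Authorization': `Bearer ${accessToken}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ uris: [trackUri] }),
      });
      if (!res.ok) throw new Error(`Playback request failed: ${res.status}`);
    }

    playTrack(token, 'spotify:track:4iV5W9uYEdYUVa79Axb7Rh');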
Problem:
Ideally I would acquire the streaming output from the soundcard (generated by an mp4 file being played) and send it to both the microphone and the speakers. I know I can use getUserMedia and createChannelSplitter (in the Web Audio API) to acquire the user media and split it into 2 outputs (based on Audacity analysis, the original signal is in stereo), which leaves me with 2 problems (see the sketch after the list):
1. getUserMedia can only get streaming input from the microphone, not from the soundcard (from what I have read)
2. streaming output can only be recorded/sent to a buffer, not sent to the microphone directly (from what I have read)
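For concreteness, the capture-and-split step I mean looks roughly like this (a sketch only; routing channel 2 into a "microphone input" is exactly the part I cannot find an API for):

    // Capture mic input and split the stereo signal into two channels
    const ctx = new AudioContext();
    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      const source = ctx.createMediaStreamSource(stream);
      const splitter = ctx.createChannelSplitter(2);
      source.connect(splitter);
      splitter.connect(ctx.destination, 0); // channel 1 -> user's ears
      // channel 2 has nowhere to go: the Web Speech API takes no stream input
    });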
Is this correct?
Possible workaround - stalled:
The user will most likely have a headset microphone on, but one workaround I have thought of is to switch to the device's inbuilt microphone, capture what comes out of the speakers, and then switch back to the headset for user input. However, I haven't found a way to switch between the inbuilt microphone and the headset microphone without asking the user every time.
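For reference, the switching I have in mind would look like this with enumerateDevices() (a sketch; whether the second getUserMedia() call re-prompts the user is exactly what I'm unsure about, and the label matching is a guess at how the devices announce themselves):

    // Pick a microphone by (partial) label after permission has been granted once
    async function getMicByLabel(labelFragment) {
      const devices = await navigator.mediaDevices.enumerateDevices();
      const mic = devices.find(
        (d) => d.kind === 'audioinput' && d.label.includes(labelFragment)
      );
      if (!mic) throw new Error('No matching microphone found');
      return navigator.mediaDevices.getUserMedia({
        audio: { deviceId: { exact: mic.deviceId } },
      });
    }

    // e.g. getMicByLabel('Built-in'), then later getMicByLabel('Headset')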
Is there a way to do this that I haven't found?
What other solutions would you suggest?
Project Explanation:
I am creating a Spanish language practice program/website written in HTML & JavaScript. An mp4 will play, and the speech recognition API will display what is said on the screen (as it is spoken in Spanish), translated into English, so the user hears, sees, and understands what the person in the mp4 is saying. Then the user will answer the mp4 person using the headset microphone (the inbuilt microphone often doesn't give good enough quality for voice recognition, depending on the device, hence the headset).
Flow chart of my workaround using the inbuilt microphone:

    mp4 -> soundcard -> Web Audio API -> channel 1 -> user's ears
                                      -> channel 2 -> microphone input -> Web Speech API -> HTML -> text onscreen

Flow chart of the ideal situation, skipping microphone input:

    mp4 -> soundcard -> Web Audio API -> channel 1 -> user's ears
                                      -> channel 2 -> Web Speech API -> HTML -> text onscreen -> user's eyes
Another potential workaround:
I would like to avoid having to manually strip an mp3 from each mp4 and then try to sync them so the voice recognition happens as the mp4 person speaks. I have read that I can run an mp3 through the voice recognition API.
The short answer is that there is currently (as of 12/2019) no way to accomplish this on this platform with the tools and budget I have. I have opted for the laborious approach: setting up individual divs with text blocks that are revealed on a timer as the person speaks. I will still use the speech API to capture what the user says so the program can run the correct video in response.
Switching between the speaker and the user's headset is a definite no-go.
Speech recognition software usually requires clean, well-captured audio, so if the sound is coming from speakers, the user's microphone is not likely to pick it up very well. And if the user is wearing headphones, there is no way for the microphone to capture the audio at all.
As far as I know, you cannot send audio files to the Web Speech API directly (I may be wrong here).
The Web Speech API is not supported by all browsers, so that is a downside to consider too: https://caniuse.com/#feat=speech-recognition
What I would recommend is checking out Google's Speech-to-Text API: https://cloud.google.com/speech-to-text/
With this service you can send the audio file directly, and they will send back the transcription.
It also supports streaming, so you could have the audio transcribed at the same time it is playing, though the timing wouldn't be perfect.
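A rough sketch of the non-streaming path using Google's Node.js client (the file name, sample rate, and language code are assumptions for this use case; the streaming variant uses streamingRecognize instead):

    // Transcribe a short clip with Google's Node.js client (@google-cloud/speech)
    const speech = require('@google-cloud/speech');
    const fs = require('fs');

    async function transcribe() {
      const client = new speech.SpeechClient();
      const [response] = await client.recognize({
        audio: { content: fs.readFileSync('clip.wav').toString('base64') },
        config: {
          encoding: 'LINEAR16',   // WAV/PCM; MP3 has historically needed the beta API
          sampleRateHertz: 16000,
          languageCode: 'es-ES',  // Spanish, matching the project described above
        },
      });
      const transcript = response.results
        .map((r) => r.alternatives[0].transcript)
        .join('\n');
      console.log(transcript);
    }

    transcribe().catch(console.error);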
Is it possible to access system audio using the Web Audio API, in order to visualize it or apply an equalizer to it? It looks like it's possible to hook up system audio to an input device that the Web Audio API can access (i.e. "Web Audio API, get the output from the soundcard"); however, ideally I would like to be able to process all sound output without making any local configuration changes.
No, this isn't possible. The closest you could get is installing a loopback audio device on the user's system, like Soundflower on OS X, and using that as the default audio output device. It's not possible without local config changes.
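For completeness, once a loopback device is routed as the system input, the page just sees it as a microphone, so the usual capture path applies (a sketch; the device setup is entirely on the user's side):

    // With a loopback device (e.g. Soundflower) set as the input device,
    // system audio arrives like any microphone stream
    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(stream);
      const analyser = ctx.createAnalyser();
      source.connect(analyser); // visualize, or insert an equalizer chain here
    });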
In WebRTC, at least in Chrome, it's possible to capture the screen in order to stream it, accomplished by using the experimental chromeMediaSource constraint.
I would like to do the same but capture only audio, so I can send it to a webpage. That is, I would like to capture not the mic, but the audio played by my machine, and send it to a website.
Is there such a constraint in WebRTC? If the answer is 'yes', is there a Firefox equivalent?
You may want to look at the MediaStream Recording API, which has been implemented in Firefox Nightly and has an Intent to Implement for Chrome.
I've put a demo at simpl.info/mediarecorder.
With Web Audio, try RecorderJS: there's a demo at webaudiodemos.appspot.com/AudioRecorder.
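A minimal MediaRecorder sketch (the API shape below is the one that eventually shipped; at the time of writing it sits behind flags in some browsers):

    // Record five seconds of microphone audio into a Blob
    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      const recorder = new MediaRecorder(stream);
      const chunks = [];
      recorder.ondataavailable = (e) => chunks.push(e.data);
      recorder.onstop = () => {
        const blob = new Blob(chunks, { type: recorder.mimeType });
        // e.g. new Audio(URL.createObjectURL(blob)).play();
      };
      recorder.start();
      setTimeout(() => recorder.stop(), 5000);
    });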
This isn't another one of those "How can I record audio in the browser?" questions... I know that the HTML5 Stream API is around the corner and that Flash can already access the user's microphone and camera. I'm simply wondering, as a JavaScript developer with little knowledge of Flash, if anyone has developed a JS library that hooks into Flash's device capabilities for recording but sends the results back to JavaScript (presumably using ExternalInterface).
In other words... libraries like SoundManager2 utilize a Flash fallback for audio playback, but they don't seem to allow for recording. Has anyone written a JS library that uses an invisible Flash movie to allow audio recording?
This does most of what you're looking for:
https://code.google.com/p/wami-recorder/
It records audio and sends it to a server via an HTTP POST (avoiding the need for a Flash Media Server.) A JavaScript API is available via ExternalInterface.
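From memory of the project's example page, the JavaScript side looks roughly like this (treat the exact method names and setup options as assumptions and check the repo before relying on them):

    // Rough shape of the WAMI JavaScript API (verify against the project)
    Wami.setup({ id: 'wami' });  // id of the div hosting the hidden Flash movie
    Wami.startRecording('https://example.com/record');  // server gets an HTTP POST
    Wami.stopRecording();
    Wami.startPlaying('https://example.com/record');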
I'm not sure why you'd want the audio bytes in JavaScript, but it would probably be easy to modify it to do that too.
Unfortunately, you can't really do Flash audio recording in the browser alone. The Flash audio interfaces are all designed (surprise surprise) to talk to a Flash Media Server (or Red5): there is no interface to store recorded audio data locally or pass it to JavaScript.
Once you have Red5/FMS set up, you can control the recording process from JavaScript: you can start/stop/play back the audio stream to/from the server. However, for security reasons you have to have a Flash movie that is a minimum of 216 x 138 (see http://blog.natebeck.net/2009/01/tip-of-the-day-tricks-of-the-mic-settings-panel/ for a writeup), otherwise the settings manager won't be shown: this prevents people from hiding an audio-recording Flash widget on a page and eavesdropping.
So no, no invisible Flash controlled from JavaScript.