I'm looking for a flash widget that allows users to record their audio and then send it to the server.
There are a few similar questions:
Record Audio and Upload as Wav or MP3 to server
They advocate using Red5 or Flash Media Server.
Shouldn't it be possible to record locally on the user's client, using the codecs that the user already has, and then upload the resulting file to the server, rather than, say, processing and recording the stream on the server itself?
Thanks.
According to the Capturing Sound Input article, if you are running Flash Player 10.1 or later you can save the microphone data to a ByteArray. The "Capturing microphone sound data" section gives the following example of how to do it:
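// Assumed declarations; the excerpt uses these names without defining them.
const DELAY_LENGTH:int = 4000;
var soundBytes:ByteArray = new ByteArray();

var mic:Microphone = Microphone.getMicrophone();
// A silence level of 0 keeps sample events coming even when the input is quiet.
mic.setSilenceLevel(0, DELAY_LENGTH);
mic.addEventListener(SampleDataEvent.SAMPLE_DATA, micSampleDataHandler);

function micSampleDataHandler(event:SampleDataEvent):void {
    // Copy every 32-bit float sample from the event into soundBytes.
    while (event.data.bytesAvailable) {
        var sample:Number = event.data.readFloat();
        soundBytes.writeFloat(sample);
    }
}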
Once you have the ByteArray you can of course do whatever you want with it.
Once you have the ByteArray, you can pass it to NetStream.appendBytes().
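If the ByteArray is uploaded to the server unchanged, the server receives nothing but raw 32-bit float samples. Here is a minimal Python sketch of decoding them on the server side, assuming the ByteArray kept its default big-endian byte order and the request body contains only the samples. The function name is hypothetical.

```python
import struct

def decode_float_samples(raw: bytes) -> list:
    """Decode raw big-endian 32-bit float samples into Python floats."""
    count = len(raw) // 4  # each sample occupies 4 bytes
    # ">" means big-endian, "f" means 32-bit float.
    # A trailing partial sample, if any, is ignored.
    return list(struct.unpack(">%df" % count, raw[:count * 4]))
```

From there the samples can be written to whatever file format the server needs.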
Related
I have looked at the documentation for Azure Media Services, but I could not find any docs or samples that talk about uploading a video from a browser using the webcam.
I am already using the MediaRecorder API in the browser to record the webcam video and then upload it, but I was wondering if there is a better alternative to MediaRecorder.
Since Microsoft already provides a media player to play the videos, I was wondering whether Microsoft similarly has a JS library to do the video recording on the frontend and upload it accordingly.
Edit: My intent is not to live stream from the browser, but to upload the file to AMS and play in the Azure media player at a later time.
EDIT:
Originally answered this thinking you meant live... But I see now you meant on-demand recorded files. I will leave the live section below.
On-Demand:
For the on-demand scenario, you should also be able to use the Media Recorder API to save the asset into local storage.
There are some good examples of doing that recording to storage out there, but the main API doc is here and has some guidance on how to save the output. The output could be WebM or MP4 depending on browser capabilities.
https://developer.mozilla.org/en-US/docs/Web/API/MediaStream_Recording_API
Once you have a local file, the steps to get that into AMS would be:
1. Create an Asset
2. Get the container name from the new Asset (we generate one for you, or you can pass the name of the container into the Asset create call)
3. Use the Azure Blob storage SDK to get a writeable SAS locator to upload the file into the container.
Useful example code:
Example of creating an Asset with a specific Blob container name
Previous Stack thread on uploading from browser to blob storage container
Azure Storage Javascript client samples
// blobService is assumed to be an Azure Storage blob service client created earlier.
// If one file has been selected in the HTML file input element
var file = document.getElementById('fileinput').files[0];
// Use 4 MB blocks for files over 32 MB, otherwise 512 KB blocks
var customBlockSize = file.size > 1024 * 1024 * 32 ? 1024 * 1024 * 4 : 1024 * 512;
blobService.singleBlobPutThresholdInBytes = customBlockSize;

var finishedOrError = false;
var speedSummary = blobService.createBlockBlobFromBrowserFile('mycontainer', file.name, file, {blockSize : customBlockSize}, function(error, result, response) {
    finishedOrError = true;
    if (error) {
        // Upload blob failed
    } else {
        // Upload successfully
    }
});
// refreshProgress() is defined elsewhere in the sample (not shown here)
refreshProgress();
Disclaimer from me...
I have not tried the above scenario myself; I'm just finding the right resources for you to check. Hope that helps!
However, I have done the live scenario below, and it does work.
Live Scenario:
It's tricky, because Media Services only supports RTMP ingest.
The MediaRecorder API just records "slices" of WebM or MP4 content in the browser. You then have to push those chunks of media someplace that is capable of sending that content into AMS as RTMP/S. Otherwise, no go.
There is a good example project from Mux called Wocket that demonstrates how you could build a solution that ingests content from the MediaRecorder API and sends it to any RTMP server. It has to go over a WebSocket connection to a middle-tier Docker container that is running FFmpeg. The FFmpeg process receives the WebM or MP4 chunks from the MediaRecorder and forwards them on to Media Services.
https://github.com/MuxLabs/wocket
I built a demo recently that uses this in Azure, and surprisingly it works, but it is not really a production-ready solution; it has lots of issues on iOS, for example. Also, I had to move the canvas drawing off the requestAnimationFrame timer and use worker-timers (an npm package) to move the canvas drawing to a web worker. That way I could at least switch focus between browser tabs or hide the browser without it stopping the webcam. Normally, timers like setInterval or setTimeout are put to sleep or throttled by modern browsers to save power and CPU.
Let me know if you want any more details after looking at the Wocket project.
Say I'm creating a YouTube-type application and want to create auto-generated captions. I have the video .mp4 file, and I want to generate a .vtt file for it. Is there any way to do that with just the SpeechRecognition API and VTTCues? For example, could I somehow get the audio data from the mp4 and run it through the speech recognition API so that it generates a transcript?
So far, what I've seen is that the SpeechRecognition API can only transcribe live microphone input. Is there a way to make it run on audio data instead?
If this helps, I'm using react in my frontend and node in my backend.
I'm not sure about accessing the Speech API directly, but with the Speech SDK you can send binary audio data directly to the recognizer.
Have a look at
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speech-to-text?tabs=windowsinstall&pivots=programming-language-csharp#recognize-from-file
and
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speech-to-text?tabs=windowsinstall&pivots=programming-language-csharp#recognize-from-in-memory-stream
All you have to do is to create your audioConfig like
var audioConfig = AudioConfig.FromWavFileInput("PathToFile.wav");
or
var audioConfig = AudioConfig.FromStreamInput(audioInputStream);
instead of
var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
If the problem is reading the mp4, NAudio should be able to do that: https://github.com/naudio/NAudio
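Whichever recognizer produces the transcript, the .vtt file itself is just plain text. Here is a minimal sketch of turning timed phrases into WebVTT, assuming the recognizer reports a start time and end time (in seconds) plus the text for each phrase. The function names are hypothetical, and it is written in Python only for brevity; the same logic carries over to a Node backend.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return "%02d:%02d:%02d.%03d" % (hours, minutes, secs, ms)

def build_vtt(segments) -> str:
    """Build a WebVTT document from (start, end, text) tuples."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in segments:
        lines.append("%s --> %s" % (vtt_timestamp(start), vtt_timestamp(end)))
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

For example, build_vtt([(0, 1.5, "Hello")]) yields a header line, a blank line, and a single cue running from 00:00:00.000 to 00:00:01.500.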
Here is my problem: I want to play a large video file (3.6 GB) stored in an S3 bucket, but it seems the file is too big, and the page crashes after 30 seconds of loading.
This is my code to play the video :
var video = document.getElementById("video");
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', sourceOpen, { once: true });

function sourceOpen() {
    URL.revokeObjectURL(video.src);
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.f40028"');
    fetch('URL_TO_VIDEO_IN_S3')
        .then(response => response.arrayBuffer())
        .then(data => {
            // Append the data into the new sourceBuffer.
            sourceBuffer.appendBuffer(data);
        })
        .catch(error => {
        });
}
I saw that blob URL could be a solution but it didn't work well with my URL.
Take my answer with a grain of salt as I am no expert. However, I am working on something very similar at the moment.
I suspect your issue is that you're attempting to load the entire resource (video file) into the browser at once. A buffer holding a file larger than a gigabyte is extremely large.
What you need to do is use the readable stream from the body of your fetch response to process the video file chunk by chunk. As long as you aren't confined to working in Safari, you should be able to use both the ReadableStream and WritableStream classes natively in the browser.
These two classes allow you to form what's called a pipe. In this case, you are "piping" data from the readable stream in your fetch response to a writable stream that you create, which is then used as the underlying source of data for your media source extension and its respective source buffers.
A stream pipe is special in that it exhibits what's called backpressure. You should definitely look this term up and read about what it means. In this case, it means the browser will not request more data once it has enough to meet its needs for video playback. The exact amount it can hold at once is specified by you, the programmer, through something called a "high water mark" (you should also read about this).
This allows you to control when and how much data the browser requests from your ongoing fetch request.
NOTE: When you use .then(response => response.arrayBuffer()) you are telling the browser to wait for the entire resource to come back and then turn the response into an array buffer.
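The backpressure behaviour described above can be illustrated outside the browser. Here is a minimal Python sketch in which a bounded queue plays the role of the high water mark: the producer blocks once the queue is full and resumes only after the consumer has taken a chunk. All names are hypothetical, and this is an analogy for the behaviour, not the browser Streams API.

```python
import io
import queue
import threading

def pipe_with_backpressure(source, sink, chunk_size=4, high_water_mark=2):
    """Copy source to sink chunk by chunk, buffering at most high_water_mark chunks."""
    buffer = queue.Queue(maxsize=high_water_mark)

    def producer():
        while True:
            chunk = source.read(chunk_size)
            buffer.put(chunk)      # blocks while the buffer is full
            if not chunk:          # an empty chunk marks the end of the stream
                return

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        chunk = buffer.get()       # frees a slot, letting the producer continue
        if not chunk:
            break
        sink.write(chunk)
    worker.join()
```

At no point are more than high_water_mark chunks waiting in memory, no matter how large the source is, which is the property that keeps a multi-gigabyte video from being loaded all at once.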
OPTION 1
Use CloudFront to create an RTMP distribution for these resources.
It will deliver your video as a stream.
Create an RTMP distribution to speed up distribution of your streaming media files using Adobe Flash Media Server's RTMP protocol.
Please note that HTML5 does not support RTMP by default (without Flash).
Check here for options
JWPlayer supports RTMP playback using flash. SO Question
---
OPTION 2
Use Elastic Transcoder to create HLS video (.m3u8 format). Again, the same JWPlayer can handle it with ease.
It's also mostly supported in native HTML5. Check compatibility with H.264.
Right now I have two methods of sending a WAV file to the server. A user can directly upload the file or make a recording with their microphone. Once the files are sent, they are processed in nearly the same way. The file is sent to S3 and can later be played by clicking on a link (which plays the file via audio = new Audio('https://S3.url'); audio.play()).
When dealing with a file from the microphone:
audio.play() seems to work. Everything in the audio object is identical (except for the URL itself), but the sound won't actually play through the speakers. On the other hand, for an uploaded file, the sound plays through the speakers.
When I visit the URLs directly, both of them open the sound player (in Chrome) or prompt a download of a WAV file (in Firefox). The sound player plays both sounds correctly, and the downloaded WAV files each contain their respective sound, which other programs can play.
If I actually download the file with sound from the user's microphone instead of sending it directly to the server, then manually upload the WAV file, everything works as it should (as it would with any other uploaded WAV file).
In any scenario where the microphone-sound is uploaded somewhere, then downloaded, it is downloaded as a WAV file and plays accordingly. Anything which uses the re-uploaded WAV file works as intended.
Here's how I'm getting the sound from the user's microphone. First, I use WebAudioTrack to place a record button on my webpage. Once the user stops their recording, they hit the submit button which runs:
saveRecButton.addEventListener("click", function() {
    save_recording(audioTrack.audioData.getChannelData(0));
});
Here, audioTrack.audioData is an AudioBuffer containing the recorded sound. getChannelData(0) is a Float32Array representing the sound. I send this array to the server (Django) via AJAX:
function save_recording(channelData){
    var uploadFormData = new FormData();
    uploadFormData.append('data', $('#some_field').val());
    ...
    // FormData converts the Float32Array to a comma-separated string
    uploadFormData.append('audio', channelData);
    $.ajax({
        'method': 'POST',
        'url': '/soundtests/save_recording/',
        'data': uploadFormData,
        'cache': false,
        'contentType': false,
        'processData': false,
        success: function(dataReturned) {
            if (dataReturned != "success") {
                [- Do Some Stuff -]
            }
        }
    });
}
Then, using wavio, a WAV file is written from an array:
import tempfile

import wavio
from django.http import HttpResponse
from numpy import array

# SoundForm, s3_bucket and some_key are defined elsewhere in the project.
def save_recording(request):
    if request.is_ajax() and request.method == 'POST':
        form = SoundForm(request.POST)
        if form.is_valid():
            with tempfile.NamedTemporaryFile() as sound_recording:
                # The Float32Array arrives as a comma-separated string
                sound_array_string = request.POST.get('audio')
                sound_array = array([float(x) for x in sound_array_string.split(',')])
                wavio.write(sound_recording, sound_array, 48000, sampwidth=4)
                sound_recording.seek(0)
                s3_bucket.put_object(Key=some_key, Body=sound_recording, ContentType='audio/x-wav')
            return HttpResponse('success')
Then, when the sound needs to be listened to:
In Python:
import boto3

session = boto3.Session(aws_access_key_id='key', aws_secret_access_key='s_key')
bucket = session.resource('s3').Bucket(name='bucket_name')
url = session.client('s3').generate_presigned_url('get_object', Params={'Bucket': bucket.name, 'Key': 'appropriate_sound_key'})
Then, in JavaScript:
audio = new Audio('url_given_by_above_python')
audio.play()
The audio plays well if I upload a file, but doesn't play at all if I use the user's microphone. Is there something about WAV files I might be missing, something that changes when I upload the microphone sound to S3 and then re-download it? I have no clue where to go next; everything about the two files seems identical. A dump of two Audio objects, one with the URL from the user's mic and another created from a manually uploaded file that was re-downloaded from that exact user-mic file, looks exactly the same (except for the URL, and both URLs play their sounds when visited or downloaded).
There's got to be some difference here, but I have no idea what it is, and have been struggling with this for a few days now. :(
The sound file you're creating is 32-bit PCM, which is an arguably non-standard audio codec. Chrome supports it (source) but Firefox does not (source, bug).
Encode it as 16-bit PCM and it'll be universally acceptable.
EDIT: As mentioned in the comments, the parameter in question is sampwidth=4 in the wavio.write call (4 bytes = 32 bits per sample).
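Here is a minimal sketch of the 16-bit encoding using only Python's standard library, assuming mono samples as floats in the range -1.0 to 1.0 (which is what getChannelData returns) and the 48000 Hz rate from the question. The function name is hypothetical, and the stdlib wave module stands in for wavio here.

```python
import io
import struct
import wave

def write_pcm16_wav(fileobj, samples, rate=48000):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV."""
    # Clamp each sample, then scale it to the signed 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(fileobj, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        # "<h" is little-endian signed 16-bit, the WAV PCM sample layout.
        wav.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

In the Django view from the question, this could take the place of the wavio.write call, for example write_pcm16_wav(sound_recording, sound_array), followed by the existing seek(0) and upload.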
I wonder if it is possible to create an HTML5 video on the fly. Some of you may have noticed the new WebRTC and its behavior with the video tag.
navigator.webkitGetUserMedia('video', gotStream, noStream);

function gotStream(stream) {
    video.src = webkitURL.createObjectURL(stream);
}
What exactly is that "stream" in gotStream(stream), and what does its interface look like, so I can generate one of my own, either by computing things or by just receiving data from a server to display as video? Second, how do I get the data out of this "stream", so I can read it from one user's webcam, send it to my server, and pass it through to the receiving user? Binary data transmission is not the topic of my question; I already have that working.
I just need the data from the "stream" of one user, and to reconstruct that "stream" for the target user who wants to see the first user's webcam.
Any further information on where to find this myself (API documentation of some sort) would also be very helpful, because I can't find any.
I am aware of the PeerConnection stuff, so no need to mention it here. Besides the webcam stuff, I would love to pipe dynamically generated videos from my server to the client, or do some sort of video transmission over dynamically changing bandwidth with ffmpeg etc., but for this I need to pipe that data to the video element.
You may want to look at Whammy: http://antimatter15.com/wp/2012/08/whammy-a-real-time-javascript-webm-encoder/.
For now, you can periodically copy a video element screen to a canvas, then save that canvas to build a video. Whammy ties together webp images generated from a canvas into a webm file, taking advantage of the similarities of the webp (image) and webm (video) format.
You could generate other images and stitch them together in the same manner. Note that this does not support audio.