Moving Google STT from Cloud Functions to dedicated GAE - javascript

I'm using Cloud Functions to convert audio/mp4 recordings from getUserMedia(), uploaded to a Storage bucket, to audio/x-flac with ffmpeg so that I can transcribe them with Google STT:
bucket
  .file(file.name)
  .download({ destination })
  .then(() =>
    ffmpeg(destination)
      .setFfmpegPath(ffmpeg_static.path)
      .audioChannels(1)
      .audioFrequency(16000)
      .format('flac')
      .on('error', console.log)
      .on('end', () =>
        bucket
          .upload(targetTempFilePath, { destination: targetStorageFilePath })
          .then(() => {
            fs.unlinkSync(destination);
            fs.unlinkSync(targetTempFilePath);
          })
      )
      .save(targetTempFilePath)
  );
Workflow: client-side MP4 => Storage bucket trigger => STT => Firestore
It works great: I get clean FLAC files, and STT works flawlessly in this combination.
But only if the input files are no larger than 1-2 MB each (I usually have a series of 5-10 files coming in at once).
I'm aware of the 10 MB limit, so I now want to let Cloud Functions handle image processing only and move all the audio work to a dedicated GAE or GCE instance.
Which is better in this case: GAE or GCE, dockerized or native, Python or Node, etc.?
How exactly could the workflow be triggered on the GCP instance after files are placed in Storage?
Any thoughts or ideas would be greatly welcomed!

I would recommend keeping a Cloud Function as the Cloud Storage trigger.
That way, you get the name of each file uploaded to your bucket.
You can check the documentation on Google Cloud Storage Triggers for some examples.
If you use Python, you can see the file name by using:
print('File: {}'.format(data['name']))
Once you have the name of the file, you can send a request to GAE to convert the audio.
I also found this post that explains how to call a URL hosted on Google App Engine, and I think it might be useful for you.
Hope this helps!
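Since the question uses JavaScript, here is a minimal Node.js sketch of that hand-off. It assumes a 1st-gen background function triggered when an object is finalized, which receives `(file, context)`. The service URL `CONVERTER_URL`, the `/convert` route and the JSON payload shape are hypothetical, and the App Engine side that would run ffmpeg and STT is not shown.

```javascript
const https = require('https');

// Hypothetical App Engine endpoint that would run ffmpeg and STT (an assumption).
const CONVERTER_URL = 'https://audio-dot-my-project.appspot.com/convert';

// Pick the fields of the Storage event that the converter needs.
function buildConversionRequest(file) {
  return { bucket: file.bucket, name: file.name, contentType: file.contentType };
}

// POST a JSON body and resolve with the HTTP status code.
function postJson(url, body) {
  return new Promise((resolve, reject) => {
    const payload = JSON.stringify(body);
    const req = https.request(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(payload),
      },
    }, res => {
      res.resume(); // drain the response so the socket is released
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(payload);
  });
}

// Storage-triggered entry point. `post` is injectable so it can be stubbed.
function onAudioUploaded(file, context, post = postJson) {
  const type = file.contentType || '';
  if (!type.startsWith('audio/')) {
    return Promise.resolve(null); // images stay with the existing functions
  }
  console.log(`File: ${file.name}`);
  return post(CONVERTER_URL, buildConversionRequest(file));
}

exports.onAudioUploaded = onAudioUploaded;
```

The function only forwards the bucket and object name, so the heavy download and conversion happen on the App Engine side rather than inside the function.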

Related

Is it possible to download an html file from a given webpage using my local network/browser as if I downloaded it myself with javascript or nodejs?

I’m a bit new to JavaScript/Node.js and its packages. Is it possible to download a file using my local browser or network? Whenever I look up scraping or downloading HTML files, it is always done through a separate package, with their server making the request to a given URL. How do I make my own computer download an HTML file, as if I had right-clicked and chosen "Save as" on a page in Google Chrome, without running into server/security issues and errors in JavaScript?
Fetching a document over HTTP(S) in Node is definitely possible, although not as simple as in some other languages. Here's the basic structure:
const https = require(`https`); // use http instead if it's an http URL
https.get(URLString, res => {
  const buffers = [];
  res.on(`data`, data => buffers.push(data));
  res.on(`end`, () => {
    const data = Buffer.concat(buffers);
    /*
      From here you can do what you want with the data. You can write it to a file
      with fs, you can console.log it using data.toString(), etc.
    */
  });
}).on(`error`, console.error); // network errors are emitted on the request
Edit: I think I missed the main question you had, give me a sec to add that.
Edit 2: If you're comfortable with the above, you can access a website the same way your browser does: open the developer tools (F12 in Chrome), go to the Network tab, find the request the browser made, and then call http(s).get(url, options, callback) with the same headers in options that you see in the browser. Most of the time you won't need all of them; the authentication/session cookie is usually enough.
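As a rough sketch of that last step, the snippet below passes browser-like headers to `get` and streams the response to disk. The helper names `browserLikeOptions` and `savePage` are made up for illustration, and the header values are placeholders that would have to be copied from the Network tab.

```javascript
const http = require('http');
const https = require('https');
const fs = require('fs');

// Build request options. The header values are placeholders (assumptions);
// replace them with what the browser actually sent.
function browserLikeOptions(cookie) {
  const headers = {
    'User-Agent': 'Mozilla/5.0 (placeholder copied from the browser)',
    'Accept': 'text/html,application/xhtml+xml',
  };
  if (cookie) headers['Cookie'] = cookie;
  return { headers };
}

// Download `url` to `destPath`, then call callback(err, destPath).
function savePage(url, destPath, options, callback) {
  const client = url.startsWith('https:') ? https : http;
  client.get(url, options, res => {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body
      return callback(new Error(`Unexpected status ${res.statusCode}`));
    }
    const out = fs.createWriteStream(destPath);
    res.pipe(out);
    out.on('finish', () => callback(null, destPath));
    out.on('error', callback);
  }).on('error', callback);
}
```

A call might look like `savePage(url, 'page.html', browserLikeOptions('session=...'), console.log)`, where the cookie string is whatever the browser sent for that site.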

How to Launch a PDF from a UWP (Universal Windows Platform) Web Application

I've converted an existing web application (HTML5, JS, CSS, etc.) into a Windows UWP app so that (hopefully) I can distribute it via the Windows Store to Surface Hubs so it can run offline. Everything is working fine, except PDF viewing. If I open a PDF in a new window, the Edge-based browser window simply crashes. If I open an IFRAME and load PDFJS into it, that also crashes. What I'd really like to do is just hand off the PDF to the operating system so the user can view it in whatever PDF viewer they have installed.
I've found some windows-specific Javascript APIs that seem promising, but I cannot get them to work. For example:
Windows.System.Launcher.launchUriAsync(
new Windows.Foundation.Uri(
"file:///"+
Windows.ApplicationModel.Package.current.installedLocation.path
.replace(/\//g,"/")+"/app/"+url)).then(function(success) {
if (!success) {
That generates a file:// URL that I can copy into Edge and it shows the PDF, so I know the URL stuff is right. However, in the application it does nothing.
If I pass an https:// URL into that launchUriAsync function, that works. So it appears that function just doesn't like file:// URLs.
I also tried this:
Windows.ApplicationModel.Package.current.installedLocation.getFileAsync(url).then(
function(file) { Windows.System.Launcher.launchFileAsync(file) })
That didn't work either. Again, no error. It just didn't do anything.
Any ideas of other things I could try?
-- Update --
See the accepted answer. Here is the code I ended up using. (Note that all my files are in a subfolder called "app"):
if (location.href.match(/^ms-appx:/)) {
    url = url.replace(/\?.+/, "");
    Windows.ApplicationModel.Package.current.installedLocation.getFileAsync(("app/" + url).replace(/\//g,"\\")).then(
        function (file) {
            var fn = performance.now()+url.replace(/^.+\./, ".");
            file.copyAsync(Windows.Storage.ApplicationData.current.temporaryFolder,
                fn).then(
                function (file2) {
                    Windows.System.Launcher.launchFileAsync(file2)
                })
        });
    return;
}
Turns out you have to turn the / into \ or it won't find the file. And copyAsync refuses to overwrite, so I just use performance.now to ensure I always use a new file name. (In my application, the source file names of the PDFs are auto-generated anyway.) If you wanted to keep the filename, you'd have to add a bunch of code to check whether it's already there, etc.
LaunchFileAsync is the right API to use here. You can't launch a file directly from the install directory because it is protected. You need to copy it first to a location that is accessible for the other app (e.g. your PDF viewer). Use StorageFile.CopyAsync to make a copy in the desired location.
Official SDK sample: https://github.com/Microsoft/Windows-universal-samples/tree/master/Samples/AssociationLaunching
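A minimal sketch of the copy-then-launch approach described in this answer. It uses the three-argument `copyAsync` overload with `NameCollisionOption.replaceExisting`, so the original file name can be kept. The function name `launchPackagedFile` is made up, and the `Windows` namespace (a global in UWP JavaScript apps) is passed in as a parameter only so the function can be exercised outside UWP.

```javascript
// Copy a file from the install directory to the temp folder, then launch it.
// `Windows` is the WinRT namespace; `relativePath` uses forward slashes.
function launchPackagedFile(Windows, relativePath) {
    // getFileAsync expects backslashes as path separators
    var winPath = relativePath.replace(/\//g, "\\");
    var fileName = winPath.split("\\").pop();
    var tempFolder = Windows.Storage.ApplicationData.current.temporaryFolder;

    return Windows.ApplicationModel.Package.current.installedLocation
        .getFileAsync(winPath)
        .then(function (file) {
            // replaceExisting overwrites an earlier copy instead of failing
            return file.copyAsync(tempFolder, fileName,
                Windows.Storage.NameCollisionOption.replaceExisting);
        })
        .then(function (copy) {
            return Windows.System.Launcher.launchFileAsync(copy);
        });
}
```

Inside the app this would be called as `launchPackagedFile(Windows, "app/" + url)`.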
I just thought I'd add a variation on this answer, which combines some details from above with this info about saving a blob as a file in a JavaScript app. My case is that I have a BLOB that represents the data for an epub file, and because of the UWP content security policy, it's not possible simply to force a click on a URL created from the BLOB (that "simple" method is explicitly blocked in UWP, even though it works in Edge). Here is the code that worked for me:
// Copy BLOB to downloads folder and launch from there in Edge
// First create an empty file in the folder
Windows.Storage.DownloadsFolder.createFileAsync(filename,
    Windows.Storage.CreationCollisionOption.generateUniqueName).then(
    function (file) {
        // Open the returned dummy file in order to copy the data to it
        file.openAsync(Windows.Storage.FileAccessMode.readWrite).then(function (output) {
            // Get the InputStream stream from the blob object
            var input = blob.msDetachStream();
            // Copy the stream from the blob to the File stream
            Windows.Storage.Streams.RandomAccessStream.copyAsync(input, output).then(
                function () {
                    output.flushAsync().done(function () {
                        input.close();
                        output.close();
                        Windows.System.Launcher.launchFileAsync(file);
                    });
                });
        });
    });
Note that CreationCollisionOption.generateUniqueName handles the file renaming automatically, so I don't need to fiddle with performance.now() as in the answer above.
Just to add that one of the things that's so difficult about UWP app development, especially in JavaScript, is how hard it is to find coherent information. It took me hours and hours to put the above together from snippets and post replies, following false paths and incomplete MS documentation.
You will want to use the PDF APIs https://github.com/Microsoft/Windows-universal-samples/tree/master/Samples/PdfDocument/js
https://github.com/Microsoft/Windows-universal-samples/blob/master/Samples/PdfDocument/js/js/scenario1-render.js
Are you simply just trying to render a PDF file?

Get file(word, excel, ppt) metadata information in nodejs

I would like to get information about a file, at minimum the number of pages, from Node.js on the client side (React). I was able to get this for PDF files using PDF.js. Could someone point out how it can be done for other file types such as Word, Excel and PowerPoint? If there are external APIs that provide this service, pointers to those would be helpful too.
To get the page count of docx and pdf files, you can use https://www.npmjs.com/package/docx-pdf-pagecount
const getPageCount = require('docx-pdf-pagecount');
getPageCount('E:/sample/document/aa/test.docx')
  .then(pages => {
    console.log(pages);
  })
  .catch((err) => {
    console.log(err);
  });
getPageCount('E:/sample/document/vb.pdf')
  .then(pages => {
    console.log(pages);
  })
  .catch((err) => {
    console.log(err);
  });
You can use XLSX to parse spreadsheet-like files. XLSX can parse the files and return all of their information.
But you can only retrieve the meta info after XLSX has parsed those files. That means that, no matter what, you have to parse them. If your files are big, doing this on the client side will be a performance issue for browsers.
Update:
A hint: you can use a tool to detect the type of each file and pass it to the corresponding parser to get the meta info.
For now, there is no such library implemented natively in JavaScript. If you are fine with modules that are not pure JavaScript, have a look at textract and see how it works.
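The "detect the type, then dispatch" hint could be sketched as below. It checks the leading signature bytes of a file and hands the bytes to a matching parser. The `parsers` table is hypothetical: each entry stands in for a real library call (PDF.js, XLSX, and so on). Note that the ZIP signature only tells you the file is a ZIP container, which covers docx, xlsx and pptx but also any other zip archive.

```javascript
// Leading "magic" bytes of the container formats involved.
const SIGNATURES = [
  { kind: 'pdf',  bytes: [0x25, 0x50, 0x44, 0x46] },                        // "%PDF"
  { kind: 'zip',  bytes: [0x50, 0x4b, 0x03, 0x04] },                        // docx, xlsx, pptx
  { kind: 'ole2', bytes: [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1] } // legacy doc, xls, ppt
];

// Return the kind whose signature matches the start of `bytes` (a Uint8Array).
function detectKind(bytes) {
  const match = SIGNATURES.find(sig => sig.bytes.every((b, i) => bytes[i] === b));
  return match ? match.kind : 'unknown';
}

// Hand the bytes to the parser registered for the detected kind.
// `parsers` is a hypothetical map such as { pdf: fn, zip: fn, ole2: fn }.
function getMetadata(bytes, parsers) {
  const kind = detectKind(bytes);
  const parse = parsers[kind];
  if (!parse) throw new Error(`No parser registered for kind: ${kind}`);
  return parse(bytes);
}
```

In a browser, the bytes could come from `new Uint8Array(await file.arrayBuffer())` for a `File` picked by the user; only the first eight bytes are needed for detection.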

Cordova / Javascript Copy Image from URL to Store in Local Storage

I'm building offline capability for my mobile app. The app has a function that syncs content from the server via an API, and it stores the API response in local storage (using localStorage.setItem('key',response)).
The app then gets the image URLs from the response and attempts to store these images with cordova-file-plugin.
I have sorted out most of it, but after much googling I still have no clue how to copy an image from a URL and save it to cordova.file.dataDirectory.
Any pointers would be really helpful.
Thanks.
Instead of building the function from scratch, try one of the plugins below:
https://github.com/chrisben/imgcache.js/
https://github.com/markmarijnissen/cordova-file-cache
I would advise you to use ngCordova. It is basically a wrapper around some of the official Cordova plugins.
Compared to cordova-file-plugin, it is much more convenient to use. Example:
$cordovaFile.createFile(cordova.file.dataDirectory, "new_file.txt", true)
  .then(function (success) {
    // success
  }, function (error) {
    // error
  });
In case you're unclear on how to download files, have a look at the ngCordova File Transfer plugin.
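For the original question, a minimal sketch that uses only cordova-plugin-file: it downloads the image as a Blob with XMLHttpRequest and writes it into cordova.file.dataDirectory. The function names are made up for illustration, and it assumes the server allows the request from the app (CORS or whitelist settings are not covered).

```javascript
// Derive a local file name from the image URL (query string and hash dropped).
function fileNameFromUrl(url) {
  var path = url.split('?')[0].split('#')[0];
  return path.substring(path.lastIndexOf('/') + 1) || 'image';
}

// Write a Blob into a directory through the cordova-plugin-file API.
// `resolveUrl` has the signature of window.resolveLocalFileSystemURL and
// `dirUrl` is a directory URL such as cordova.file.dataDirectory; both are
// passed in so the function can be exercised outside a device.
function saveBlob(resolveUrl, dirUrl, fileName, blob, onDone, onError) {
  resolveUrl(dirUrl, function (dirEntry) {
    dirEntry.getFile(fileName, { create: true, exclusive: false }, function (fileEntry) {
      fileEntry.createWriter(function (writer) {
        writer.onwriteend = function () { onDone(fileEntry.toURL()); };
        writer.onerror = onError;
        writer.write(blob);
      }, onError);
    }, onError);
  }, onError);
}

// Download the image as a Blob, then store it in the app's data directory.
function cacheImage(imageUrl, onDone, onError) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', imageUrl, true);
  xhr.responseType = 'blob';
  xhr.onload = function () {
    if (xhr.status !== 200) { return onError(new Error('HTTP ' + xhr.status)); }
    var resolveUrl = function (url, ok, fail) {
      window.resolveLocalFileSystemURL(url, ok, fail);
    };
    saveBlob(resolveUrl, cordova.file.dataDirectory,
      fileNameFromUrl(imageUrl), xhr.response, onDone, onError);
  };
  xhr.onerror = onError;
  xhr.send();
}
```

`onDone` receives the local URL of the stored file, which could be saved next to the API response in localStorage and used as the `src` of an image while offline.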

HTML5 FileSystem API

I have created a file, 'log.txt', with the FileSystem API:
function initFS(grantedBytes) {
    window.requestFileSystem(window.PERSISTENT, grantedBytes, function (filesystem) {
        fs = filesystem;
        fs.root.getFile('log.txt', { create: true, exclusive: true }, function (fileEntry) {
            // fileEntry.isFile === true
            // fileEntry.name == 'log.txt'
            // fileEntry.fullPath == '/log.txt'
            console.log(fileEntry.fullPath);
        }, errorHandler);
    }, errorHandler);
}
initFS(1024*1024);
But I do not fully understand its structure. Is there any way to explore this file, for example from Windows Explorer, and see it in the file system?
There is an even simpler way. In Chrome, visit one of these URLs.
For http, it's "filesystem:http://"+location.host+"/persistent/".
For https, it's "filesystem:https://"+location.host+"/persistent/".
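A small sketch that builds that URL for either scheme; the helper name `persistentRootUrl` is made up, and it expects an object shaped like `window.location`.

```javascript
// Build the "filesystem:" URL of the persistent root for a given origin.
// `loc` is window.location or any object with `protocol` and `host`.
function persistentRootUrl(loc) {
  return 'filesystem:' + loc.protocol + '//' + loc.host + '/persistent/';
}
```

Running `persistentRootUrl(window.location)` in the page's console gives the URL to paste into the address bar.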
Sort of. The FileSystem API doesn't encrypt the data it stores locally. It does, however, change the file names. So you may have named the file log.txt, but if you poke around where the FileSystem API stores files, you'd probably find it under some arbitrary generated name like "00010" or in a generated directory like "24/00123".
Anyway, you can open each file in a text editor. If your file had text written to it, you would be able to view it as such; if you wrote JSON to it, that would appear as a human-readable string.
On Windows 7, with Chrome it's found here:
C:\Users\{user}\AppData\Local\Google\Chrome\User Data\Default\File System\
If you want to find out where it is stored via Chrome on other OS please see this post
Log files that an end-user or maintainer might want to see should be stored someplace in the normal file system. While the checked answer suggests how to find them when the HTML5 API is used, this location is subject to change and is troublesome to find.
A better solution is to have the user choose the directory for log files (and perhaps other files) when the app is installed, using chrome.fileSystem.chooseEntry, and then retain that entry and save it in local storage so it can be reused on subsequent launches.
