Zip pdf files from url - javascript

I'm trying to zip multiple PDF files from existing links using the archiver npm module. I can zip multiple files from memory or from my local machine, but it doesn't work when I pass URLs to fs.createReadStream().
The fs.createReadStream() docs say all I need to do is pass in the path of the file I want to stream data from. I know I'm doing something awfully wrong, but I can't work out what exactly it is.
var fs = require('fs');
var archiver = require('archiver');
var archive = archiver('zip');
res.set('Content-Type', 'application/zip');
res.set('Content-Disposition', 'attachment; filename=123.zip');
archive.pipe(res);
archive
  .append(fs.createReadStream('https://URL-1-XXX.pdf'), { name: 'file1.pdf' })
  .append(fs.createReadStream('https://URL-1-XXX.pdf'), { name: 'file2.pdf' });
archive.finalize(function(err, bytes) {
  if (err)
    throw err;
  console.log('done: ', bytes);
});

fs.createReadStream is used to read files stored on disk, not to make HTTPS requests. I'd advise you to read the API docs a little further and look into the https module.
Try this instead:
var request = require('request');
archive
  .append(request.get('https://URL-1-XXX.pdf'), { name: 'file1.pdf' })
  .append(request.get('https://URL-1-XXX.pdf'), { name: 'file2.pdf' });
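Note that the request package has since been deprecated. A minimal sketch using only Node's built-in https module instead (placeholder URLs, reusing the archive set up above; redirects and non-200 responses aren't handled here):
var https = require('https');

// Resolve with the response stream for a URL (assumes a direct 200 response)
function fetchStream(url) {
  return new Promise(function (resolve, reject) {
    https.get(url, resolve).on('error', reject);
  });
}

Promise.all([
  fetchStream('https://URL-1-XXX.pdf'),
  fetchStream('https://URL-2-XXX.pdf')
]).then(function (streams) {
  archive
    .append(streams[0], { name: 'file1.pdf' })
    .append(streams[1], { name: 'file2.pdf' });
  archive.finalize();
});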

Related

Extract a non ZIP file to files on disk?

I have an .app file which is structured like a zip file.
Now I would like to extract all of the files in the .app file.
I tried converting the .app file to a zip file in code (just copying and resaving it as a .zip), but then it's an "SFX ZIP Archive", which most of the unzip libraries in Node.js can't read.
For example, AdmZip gives this error message:
rejected promise not handled within 1 second: Error: Invalid CEN header (bad signature)
var AdmZip = require('adm-zip');
var admZip2 = new AdmZip("C:\\temp\\Test\\Microsoft_System.zip");
admZip2.extractAllTo("C:\\temp\\Test\\System", true)
So now I don't know how to deal with it, because I need to extract the files, with all subfolders and subfiles, to a specific folder on the computer.
How would you do this?
You can download the .app file here:
https://drive.google.com/file/d/1i7v_SsRwJdykhxu_rJzRCAOmam5dAt-9/view?usp=sharing
Thanks for your help :)
EDIT:
I'm already using JSZip to resave the zip file as a normal ZIP archive, but this is an extra step which costs some time.
Maybe someone knows how to extract files to a path with JSZip :)
EDIT 2:
Just for you information: It's a VS Code Extension Project
EDIT 3:
I got something which worked for me.
For my solution I used worker threads (because they extract in parallel).
var zip = new JSZip();
zip.loadAsync(data).then(async function (contents) {
  // Remove the metadata entries that shouldn't be extracted
  zip.remove('SymbolReference.json');
  zip.remove('[Content_Types].xml');
  zip.remove('MediaIdListing.xml');
  zip.remove('navigation.xml');
  zip.remove('NavxManifest.xml');
  zip.remove('Translations');
  zip.remove('layout');
  zip.remove('ProfileSymbolReferences');
  zip.remove('addin');
  zip.remove('logo');
  // workerData.files = Object.keys(contents.files)
  // Loop through the file names; for each one take its dirname,
  // check whether that directory exists (create it if not), then
  // write the file with its content. You'll have to adapt this to
  // your own code, because the snippets here come from two
  // different files. Hope it helps someone :)
workerData.files.slice(workerData.startIndex, workerData.endIndex).forEach(function (filename, index) {
  workerData.zip.file(filename).async('nodebuffer').then(async function (content) {
    var destPath = path.join(workerData.baseAppFolderApp, filename);
    var dirname = path.dirname(destPath);
    // Create the directory if it doesn't exist
    await createOnNotExist(dirname);
    files[index] = false;
    fs.writeFile(destPath, content, async function (err) {
      // Mark this file as written; once none are pending, notify the parent
      files[index] = true;
      if (!files.includes(false)) {
        parentPort.postMessage(workerData);
      }
    });
  });
});
JSZip is a library for creating, reading and editing .zip files with JavaScript, with a lovely and simple API.
Link: https://www.npmjs.com/package/jszip
example (extraction)
var fs = require('fs');
var path = require('path');
var JSZip = require('jszip'); // note: the npm package name is lowercase

fs.readFile(filePath, function(err, data) {
  if (!err) {
    var zip = new JSZip();
    zip.loadAsync(data).then(function(contents) {
      Object.keys(contents.files).forEach(function(filename) {
        if (contents.files[filename].dir) return; // skip directory entries
        zip.file(filename).async('nodebuffer').then(function(content) {
          var dest = path.join(destDir, filename); // destDir = your output folder
          // Make sure the target directory exists before writing
          fs.mkdirSync(path.dirname(dest), { recursive: true });
          fs.writeFileSync(dest, content);
        });
      });
    });
  }
});
The file is a valid zip file appended to some sort of executable.
The easiest way is to extract it by calling an unzipper such as unzipada.exe - free, open-source software available here. Pre-built Windows executables are available in the Files section.
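Building on that observation, a hedged Node.js workaround is to locate the first local file header signature (PK\x03\x04) in the raw bytes and load everything from that offset with JSZip. This is only a sketch: the path is the one from the question, and it assumes the embedded archive's offsets are relative to that signature, which held for this .app file but isn't guaranteed for every self-extracting archive.
var fs = require('fs');
var JSZip = require('jszip');

var raw = fs.readFileSync('C:\\temp\\Test\\Microsoft_System.app');
// "PK\x03\x04" marks the first local file header of the embedded zip
var start = raw.indexOf(Buffer.from([0x50, 0x4b, 0x03, 0x04]));
if (start === -1) throw new Error('No zip signature found');

JSZip.loadAsync(raw.slice(start)).then(function (zip) {
  console.log(Object.keys(zip.files)); // list entries; extract as shown above
});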

How to download a big file directly to the disk, without storing it in RAM of a server and browser?

I want to implement downloading of a big file (approx. 10-1024 MB) from the same server (without external cloud file storage, i.e. on-premises) where my app runs, using Node.js and Express.js.
I figured out how to do that by converting the entire file into a Blob, transferring it over the network, and then generating a download link with window.URL.createObjectURL(…) for the Blob. Such an approach works perfectly as long as the files are small; otherwise it is impossible to keep the entire Blob in the RAM of either the server or the client.
I've tried to implement several other approaches with the File API and AJAX, but it looks like Chrome loads the entire file into RAM and only then dumps it to disk. Again, that might be OK for small files, but for big ones it's not an option.
My last attempt was to send a basic GET request:
const aTag = document.createElement("a");
aTag.href = `/downloadDocument?fileUUID=${fileName}`;
aTag.download = fileName;
aTag.click();
On the server-side:
app.mjs
app.get("/downloadDocument", async (req, res) => {
req.headers.range = "bytes=0";
const [urlPrefix, fileUUID] = req.url.split("/downloadDocument?fileUUID=");
const downloadResult = await StorageDriver.fileDownload(fileUUID, req, res);
});
StorageDriver.mjs
export const fileDownload = async function fileDownload(fileUUID, req, res) {
//e.g. C:\Users\User\Projects\POC\assets\wanted_file.pdf
const assetsPath = _resolveAbsoluteAssetsPath(fileUUID);
const options = {
dotfiles: "deny",
headers: {
"Content-Disposition": "form-data; name=\"files\"",
"Content-Type": "application/pdf",
"x-sent": true,
"x-timestamp": Date.now()
}
};
res.sendFile(assetsPath, options, (err) => {
if (err) {
console.log(err);
} else {
console.log("Sent");
}
});
};
When I click on the link, Chrome shows the file in Downloads but with the status Failed - No file. No file appears in the download destination.
My questions:
Why, in the case of sending a GET request, do I get Failed - No file?
As far as I understand, res.sendFile can be the right choice for small files, but for big ones it's better to use res.write, which can split the response into chunks. Is it possible to use res.write with a GET request?
P.S. I've reworked this question to make it narrower and clearer. Previously it focused on downloading a big file from Dropbox without storing it in RAM; that answer can be found here:
How to download a big file from Dropbox with Node.js?
Chrome can't show nice download progress because the file is being downloaded in the background, and only after the download finishes is a link to the file created and "clicked" to force Chrome to show the dialog for the already-downloaded file.
It can be done more easily: create a GET request and let the browser download the file itself, without AJAX.
app.get("/download", async (req, res, next) => {
const { fileName } = req.query;
const downloadResult = await StorageDriver.fileDownload(fileName);
res.set('Content-Type', 'application/pdf');
res.send(downloadResult.fileBinary);
});
function fileDownload(fileName) {
const a = document.createElement("a");
a.href = `/download?fileName=${fileName}`;
a.download = fileName;
a.click();
}
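For genuinely big files, though, res.send(downloadResult.fileBinary) still buffers the whole file in the server's RAM. A stream-based sketch (assuming the file already sits on the server's disk and reusing the question's _resolveAbsoluteAssetsPath helper) avoids that:
const fs = require("fs");
const path = require("path");

app.get("/downloadDocument", (req, res) => {
  const assetsPath = _resolveAbsoluteAssetsPath(req.query.fileUUID);
  res.set("Content-Type", "application/pdf");
  // "attachment" (not "form-data") is what makes the browser save the file
  res.set("Content-Disposition", `attachment; filename="${path.basename(assetsPath)}"`);
  // Stream the file in chunks; neither server nor browser holds it all in RAM
  fs.createReadStream(assetsPath)
    .on("error", () => res.status(404).end())
    .pipe(res);
});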

Creating an in-memory .zip with archiver, and then sending this file to the client with koa on a node server

I (as a Node server with the Koa framework) need to take a JSON blob, turn it into a file with the extension .json, stick that in a zip archive, then send the archive as a file attachment in response to a request from the client.
It seems the way to do this is to use the Archiver tool. As best I can understand, you create an archive, append a JSON blob to it as a .json file (it automatically creates the file within the archive?), then "pipe" that .zip to the response object. The "piping" paradigm is where my understanding fails, mostly due to not getting what these docs are saying.
The Archiver docs, as well as some Stack Overflow answers, use language that to me means "stream the data to the client by piping (the zip file) to the HTTP response object". The Koa docs say that ctx.body can be set to a stream directly, so here's what I tried:
Attempt 1
const archive = archiver.create('zip', {});
ctx.append('Content-Type', 'application/zip');
ctx.append('Content-Disposition', `attachment; filename=libraries.zip`);
ctx.response.body = archive.pipe(ctx.body);
archive.append(JSON.stringify(blob), { name: 'libraries.json'});
archive.finalize();
Logic: the response body should be set to a stream, and that stream should be the archiver stream (pointing at ctx.body).
Result: .zip file downloads on client-side, however the zip is malformed somehow (can't open).
Attempt 2
const archive = archiver.create('zip', {});
ctx.append('Content-Type', 'application/zip');
ctx.append('Content-Disposition', `attachment; filename=libraries.zip`);
archive.pipe(ctx.body);
archive.append(JSON.stringify(blob), { name: 'libraries.json'});
archive.finalize();
Logic: setting a body to be a stream after, uh, "pointing a stream at it" did seem silly, so I instead copied other Stack Overflow examples.
Result: Same as attempt 1.
Attempt 3
Based on https://github.com/koajs/koa/issues/944
const archive = archiver.create('zip', {});
ctx.append('Content-Type', 'application/zip');
ctx.append('Content-Disposition', `attachment; filename=libraries.zip`);
ctx.body = ctx.request.pipe(archive);
archive.append(JSON.stringify(blob), { name: 'libraries.json'});
archive.finalize();
Result: ctx.request.pipe is not a function.
I'm probably not reading this right, but everything online seems to indicate that doing archive.pipe(some sort of client-sent stream) "magically just works". That's a quote from the archive tool's example file; "streaming magic" is the phrase they use.
How do I, in memory, turn a JSON blob into a .json file, append that .json to a .zip that is then sent to the client and downloaded, and can then be successfully unzipped to see the .json?
EDIT: If I console.log the ctx.body after archive.finalize(), it shows a ReadableStream, which seems right. However, it has a "path" property that worries me - it's the index.html, which I had wondered about - in the "response preview" on the client side, I'm seeing a stringified version of our index.html. The file was still downloading as a .zip, so I wasn't too concerned, but now I'm wondering if this is related.
EDIT 2: Looking deeper into the response on the client side, it appears that the data sent back is straight up our index.html, so now I'm very confused.
const { PassThrough } = require('stream');

const passThrough = new PassThrough();
const archive = archiver.create('zip', {});
archive.pipe(passThrough);
archive.append(JSON.stringify(blob), { name: 'libraries.json'});
archive.finalize();
ctx.body = passThrough;
ctx.type = 'zip';
This should work fine for your use case, since archiver isn't actually a stream that we should pass to ctx.body I guess.
Yes, you can directly set ctx.body to the stream. Koa will take care of the piping. No need to manually pipe anything (unless you also want to pipe to a log, for instance).
const archive = archiver('zip');
ctx.type = 'application/zip';
ctx.response.attachment('test.zip');
ctx.body = archive;
archive.append(JSON.stringify(blob), { name: 'libraries.json' });
archive.finalize();

How to save a JSON file using Blob in JS without HTML

I just want to store my JSON data in a file in a particular directory using JS. I cannot see the created file using the following code.
var jsonse = JSON.stringify(submitted);
var blob = new Blob([jsonse], {type: "application/json"});
var file = new File([blob], "" + workerID + ".json")
A link to the relevant JS documentation would also suffice.
Assuming you're not using a web browser (which cannot write to your file system, for hopefully obvious security reasons; that's another question), you can redirect output from your script to a file.
node yourfile.js > output_file.json
Or you can use the fs module.
Writing files in Node.js
// note jsonse is the json blob
var fs = require('fs');
fs.writeFile("/tmp/test", jsonse, function(err) {
if(err) {
return console.log(err);
}
console.log("The file was saved!");
});
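If the code does have to run in a browser after all, note that constructing a File object by itself never touches the disk. The closest browser-side equivalent is the object-URL download pattern, sketched here with the question's submitted and workerID variables; the browser, not the script, decides where the file lands:
var jsonse = JSON.stringify(submitted);
var blob = new Blob([jsonse], { type: "application/json" });
// Wrap the Blob in an object URL and "click" a temporary anchor
var url = URL.createObjectURL(blob);
var a = document.createElement("a");
a.href = url;
a.download = workerID + ".json";
a.click();
URL.revokeObjectURL(url); // release the object URL once the download is triggered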

generating and serving static files with Meteor

I'm looking to create static text files based upon the content of a supplied object, which can then be downloaded by the user. Here's what I was planning on doing:
When the user hits 'export' the application calls a Meteor.method() which, in turn, parses and writes the file to the public directory using typical Node methods.
Once the file is created, in the callback from Meteor.method() I provide a link to the generated file. For example, 'public/userId/file.txt'. The user can then choose to download the file at that link.
I then use Meteor's Connect module (which it uses internally) to route any requests to the above URL to the file itself. I could do some permissions checking based on the userId and the logged-in state of the user.
The problem: When static files are generated in public, the web page automatically reloads each time. I thought that it might make more sense to use something like Express to generate a REST endpoint, which could deal with creating the files. But then I'm not sure how to deal with permissions if I don't have access to the Meteor session data.
Any ideas on the best strategy here?
In Meteor 0.7.x - 1.3.x (this originally targeted 0.6.6.3) you can do the following:
To write
var fs = Npm.require('fs');
var filePath = process.env.PWD + '/.uploads_dir_on_server/' + fileName;
fs.writeFileSync(filePath, data, 'binary');
To serve
In a vanilla Meteor app
var fs = Npm.require('fs');
WebApp.connectHandlers.use(function(req, res, next) {
var re = /^\/uploads_url_prefix\/(.*)$/.exec(req.url);
if (re !== null) { // Only handle URLs that start with /uploads_url_prefix/*
var filePath = process.env.PWD + '/.uploads_dir_on_server/' + re[1];
var data = fs.readFileSync(filePath);
res.writeHead(200, {
'Content-Type': 'image'
});
res.write(data);
res.end();
} else { // Other urls will have default behaviors
next();
}
});
When using iron:router
This should be a server-side route (e.g. defined in a file in the /server/ folder)
Edit (2016-May-9)
var fs = Npm.require('fs');
Router.route('uploads', {
name: 'uploads',
path: /^\/uploads_url_prefix\/(.*)$/,
where: 'server',
action: function() {
var filePath = process.env.PWD + '/.uploads_dir_on_server/' + this.params[0];
var data = fs.readFileSync(filePath);
this.response.writeHead(200, {
'Content-Type': 'image'
});
this.response.write(data);
this.response.end();
}
});
Outdated format:
Router.map(function() {
this.route('serverFile', {
...// same as object above
}
});
Notes
process.env.PWD will give you the project root.
If you plan to put files inside your project, don't use the public or private Meteor folders; use dot folders (i.e. hidden folders, e.g. .uploads).
Not respecting these two points will cause local Meteor to restart on every upload, unless you run your Meteor app with: meteor run --production
I've used this approach for a simple image upload & serve (based on Dario's version).
Should you wish for more complex file management, please consider CollectionFS.
The symlink hack will no longer work in Meteor (from 0.6.5). Instead I suggest creating a package with similar code to the following:
package.js
Package.describe({
summary: "Application file server."
});
Npm.depends({
connect: "2.7.10"
});
Package.on_use(function(api) {
api.use(['webapp', 'routepolicy'], 'server');
api.add_files([
'app-file-server.js',
], 'server');
});
app-file-server.js
var connect = Npm.require('connect');
RoutePolicy.declare('/my-uploaded-content', 'network');
// Listen to incoming http requests
WebApp.connectHandlers
.use('/my-uploaded-content', connect.static(process.env['APP_DYN_CONTENT_DIR']));
I was stuck on the exact same problem, except I needed users to upload files, in contrast to your server-generated files. I sort of solved it by creating an "uploads" folder as a sibling of the client/public/server folders (at the same level), and then creating a symbolic link to the '.meteor/local/build/static' folder, like:
ln -s ../../../../uploads .meteor/local/build/static/
but with the Node.js filesystem API at server start time:
Meteor.startup(function () {
  var fs = Npm.require('fs');
  fs.symlinkSync('../../../../uploads', '.meteor/local/build/static/uploads');
});
In your case you may have a folder like "generatedFiles" instead of my "uploads" folder.
You need to do this every time the server starts up, because these folders are regenerated each time the server starts, e.g. when a file changes in your implementation.
Another option is to use a server side route to generate the content and send it to the user's browser for download. For example, the following will look up a user by ID and return it as JSON. The end user is prompted to save the response to a file with the name specified in the Content-Disposition header. Other headers, such as Expires, could be added to the response as well. If the user does not exist, a 404 is returned.
Router.route("userJson", {
where: "server",
path: "/user-json/:userId",
action: function() {
var user = Meteor.users.findOne({ _id: this.params.userId });
if (!user) {
this.response.writeHead(404);
this.response.end("User not found");
return;
}
this.response.writeHead(200, {
"Content-Type": "application/json",
"Content-Disposition": "attachment; filename=user-" + user._id + ".json"
});
this.response.end(JSON.stringify(user));
}
});
This method has one big downside, however: server-side routes do not provide an easy way to get the currently logged-in user. See this issue on GitHub.
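One commonly used workaround, sketched below, is to have the client append its login token to the URL and resolve the user manually. It leans on Accounts._hashLoginToken and Accounts._storedLoginToken, which are undocumented internals, so treat it as a sketch rather than a stable API:
Router.route("userJsonAuthed", {
  where: "server",
  path: "/user-json-authed/:userId",
  action: function() {
    // The client builds the URL with ?token=... taken from Accounts._storedLoginToken()
    var token = this.params.query.token;
    var user = token && Meteor.users.findOne({
      "services.resume.loginTokens.hashedToken": Accounts._hashLoginToken(token)
    });
    if (!user) {
      this.response.writeHead(403);
      this.response.end("Not authorized");
      return;
    }
    // ...proceed as in the route above
  }
});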
