I am developing a Firefox addon. I need to save a bunch of data URI images to disk. How do I approach this?
I have browsed through the file I/O snippets on MDN, but the snippets don't help me much.
There are async and sync methods. I would like to use the async method, but how can I write a binary file using it?
Components.utils.import("resource://gre/modules/NetUtil.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
// file is nsIFile
var file = FileUtils.getFile("Desk", ["test.png"]);
// You can also optionally pass a flags parameter here. It defaults to
// FileUtils.MODE_WRONLY | FileUtils.MODE_CREATE | FileUtils.MODE_TRUNCATE;
var ostream = FileUtils.openSafeFileOutputStream(file);
//base64 image that needs to be saved
image ="iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==";
// How can I create an inputstream from the image data URI?
var inputstream = createInputstream(image);
// The last argument (the callback) is optional.
NetUtil.asyncCopy(inputstream, ostream, function(status) {
if (!Components.isSuccessCode(status)) {
// Handle error!
return;
}
// Data has been written to the file.
});
It sounds like you'd like to write not the data URI itself but the binary data it "contains", so I'll answer that.
First, let's assume we have an actual data URI (if not, prepending data:application/octet-stream;base64, isn't too hard ;)
// btoa("helloworld") as a placeholder ;)
var imageDataURI = "data:application/octet-stream;base64,aGVsbG93b3JsZA==";
Option 1 - Using OS.File
OS.File has the benefit that it is truly async. On the other hand, NetUtil is only mostly async, in that there will be stat calls on the main thread and the file will be opened and potentially closed on the main thread as well (which can lead to buffer flushes and hence block the main thread while the flush is happening).
After constructing a path (with the help of some constants), OS.File.writeAtomic is suited for the job.
Components.utils.import("resource://gre/modules/osfile.jsm");
var file = OS.Path.join(OS.Constants.Path.desktopDir, "test.png");
var str = imageDataURI.replace(/^.*?;base64,/, "");
// Decode to a byte string
str = atob(str);
// Decode to a Uint8Array, because OS.File.writeAtomic expects an ArrayBuffer(View).
var data = new Uint8Array(str.length);
for (var i = 0, e = str.length; i < e; ++i) {
data[i] = str.charCodeAt(i);
}
// To support Firefox 24 and earlier, you'll need to provide a tmpPath. See MDN.
// There is in my opinion no need to support these, as they are end-of-life and
// contain known security issues. Let's not encourage users. ;)
var promised = OS.File.writeAtomic(file, data);
promised.then(
function() {
// Success!
},
function(ex) {
// Failed. Error information in ex
}
);
Option 2 - Using NetUtil
NetUtil has some drawbacks in that it is not fully async, as already stated above.
We can take a shortcut and use NetUtil.asyncFetch to fetch the data URI directly, which gives us a stream we can pass along to .asyncCopy.
Components.utils.import("resource://gre/modules/NetUtil.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
// file is nsIFile
var file = FileUtils.getFile("Desk", ["test.png"]);
NetUtil.asyncFetch(imageDataURI, function(inputstream, status) {
if (!inputstream || !Components.isSuccessCode(status)) {
// Failed to read data URI.
// Handle error!
return;
}
// You can also optionally pass a flags parameter here. It defaults to
// FileUtils.MODE_WRONLY | FileUtils.MODE_CREATE | FileUtils.MODE_TRUNCATE;
var ostream = FileUtils.openSafeFileOutputStream(file);
// The last argument (the callback) is optional.
NetUtil.asyncCopy(inputstream, ostream, function(status) {
if (!Components.isSuccessCode(status)) {
// Handle error!
return;
}
// Data has been written to the file.
});
});
Related
I have a really huge JSON object (created with a JavaScript parser called espree; it contains an array of objects). I want to write it to a .json file, but it fails every time with memory allocation problems (my heap size is 22 GB).
As far as I understand, the buffer gets overloaded because the data is not being written to the file fast enough.
If I use synchronous file operations only, the output gets written to the file, but the running time of my application explodes.
Solutions I have tried that failed (I tried to serialize the whole object, then tried to serialize the items of the array):
JSON.stringify
JSONStream
big-json (which should serialize the object as a stream, but the buffer still gets overloaded)
watching for drain events
Here is the current code:
const fs = require('fs');
const bjson = require('big-json');
function save(result) {
let outputStream = fs.createWriteStream(/*path*/);
const stringifyStream = bjson.createStringifyStream({
body: result
});
function write(d) {
let result = outputStream.write(d);
if (!result) {
outputStream.once('drain', write);
}
}
stringifyStream.on('data', function (chunk) {
write(chunk);
});
stringifyStream.on('end', function () {
outputStream.end();
});
}
let results = [/*results as an array, containing lots of json objects*/];
for (let i = 0; i < results.length; i++){
save(results[i]);
}
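For what it's worth, the usual backpressure pattern is to pause the source stream when write() returns false and resume it on 'drain', rather than re-registering the write function itself. A minimal sketch of that pattern follows; the stream names mirror the code above, and the extra path parameter on save is only there for illustration:
const fs = require('fs');
const bjson = require('big-json');

function save(result, path) {
  const outputStream = fs.createWriteStream(path);
  const stringifyStream = bjson.createStringifyStream({ body: result });

  stringifyStream.on('data', function (chunk) {
    // write() returns false once the internal buffer is full.
    if (!outputStream.write(chunk)) {
      // Stop producing until the buffered data has been flushed.
      stringifyStream.pause();
      outputStream.once('drain', function () {
        stringifyStream.resume();
      });
    }
  });

  stringifyStream.on('end', function () {
    outputStream.end();
  });
}
Note that stringifyStream.pipe(outputStream) does the same wiring for you, which is usually the simpler choice.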
The performance issue comes from the JSON-to-string transformation. I had the same issue and solved it by storing the data in msgpack format.
As explained here, I installed msgpack-lite in my project with npm:
npm install msgpack-lite
This is the code I wrote to store my JSON object:
var fs = require("fs");
var msgpack = require("msgpack-lite");
var writeStream = fs.createWriteStream("file.msp");
var encodeStream = msgpack.createEncodeStream();
encodeStream.pipe(writeStream);
// send multiple objects to stream
encodeStream.write(myBigObject);
// call this once you're done writing to the stream.
encodeStream.end();
And this is the code for reading my file back and restoring my object. I don't know why it doesn't work with streams:
var fs = require("fs");
var msgpack = require("msgpack-lite");
var buffer = fs.readFileSync("file.msp");
var myBigObject = msgpack.decode(buffer);
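For the streaming read, msgpack-lite also ships a decode stream; a small sketch of how that is wired up (I haven't checked whether it fits this particular data, so treat it as a starting point):
var fs = require("fs");
var msgpack = require("msgpack-lite");

var readStream = fs.createReadStream("file.msp");
var decodeStream = msgpack.createDecodeStream();

// Every object written to the encode stream comes back as one 'data' event.
readStream.pipe(decodeStream).on("data", function (obj) {
  console.log(obj);
});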
I have yet to find an answer to my problem in the examples from questions others have asked about this.
I wrote a little web scraper that stores data in an array, and I would like to write that array to a file. I'm having trouble setting things up correctly.
I am using Node.js. Could someone write a sample that takes array content and writes it to a file? Please break it down to the basics; I am still new to programming.
Thanks, the code is below:
var content = [];
var request = require('request');
var cheerio = require('cheerio');
var URL = 'http://www.amazon.com';
request(URL, function(error, response, html){
if (error){
console.log('Error:', error);
}
if (response.statusCode !== 200) {
console.log('Invalid Status Code Returned:', response.statusCode);
}
//console.log(html);
var $ = cheerio.load(html);
$('td').each(function (i, element) {
var a = $(this).next();
var trimmed_a = a.text();
trimmed_a = trimmed_a.trim();
var str = trimmed_a.replace(/\s\s+/g,"");
var newStr = str.trim();
content.push(newStr);
});
console.log(content);
})
Simplest possible answer:
var fs = require('fs');
var arr = ['cat','dog','bird'];
var filename = 'output.txt';
var str = JSON.stringify(arr, null, 4);
fs.writeFile(filename, str, function(err){
if(err) {
console.log(err)
} else {
console.log('File written!');
}
});
Here, arr would be your array of data, which you're casting to a string because fs.writeFile expects a string. I used the additional null, 4 arguments to make it pretty-print, so you can read it with a four-space indent.
Hope this helps!
It's not possible to store a real array/object in a file – a file's contents are just bytes – however you can store a stringified representation of the object and then parse that same representation back, using JSON for example (I think this also applies to Node.js):
var json_format = JSON.stringify(content);
So, if you want to read the array back after getting the file's contents:
JSON.parse(json_format)
Remember that JSON has no notion of function declarations, and all primitive values are supported except NaN, Infinity, undefined (which is not a value), etc., while special number syntaxes (exponent (+ | -), ...) are still included: see JSON. Values that JSON doesn't support are treated as null by JSON.stringify. I'm not sure how it works exactly across different platforms, though (I only use browser JS).
Now, to save/write the file we currently have the asynchronous fs.writeFile and the synchronous fs.writeFileSync. I don't know much about Node.js, though. When using these methods you must require the File System module somewhere, normally like so:
var fs = require('fs');
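Putting those pieces together, a minimal round-trip sketch (the file name content.json is just an example):
var fs = require('fs');

var content = ['cat', 'dog', 'bird'];

// Write: serialize the array to a JSON string first.
fs.writeFile('content.json', JSON.stringify(content), function (err) {
  if (err) throw err;

  // Read: parse the string back into an array.
  fs.readFile('content.json', 'utf8', function (err, data) {
    if (err) throw err;
    var restored = JSON.parse(data);
    console.log(restored); // [ 'cat', 'dog', 'bird' ]
  });
});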
I need to monitor a file for changes. Because a large number of new entries are appended to this file, I need to watch it continuously and get the newly inserted content so that I can parse it.
I found this code:
fs.watchFile('var/log/query.log', function() {
console.log('File Changed ...');
//how to get the new line which is now inserted?
});
Here is an example of how I used fs.watchFile to monitor a log file for a game called Hearthstone and pick up new log entries so I could track game events as they happened while playing: https://github.com/chevex-archived/hearthstone-log-watcher/blob/master/index.js
var fs = require('fs');
var options = {
logFile: '~/Library/Preferences/Blizzard/Hearthstone/log.config',
endOfLineChar: require('os').EOL
};
// Obtain the initial size of the log file before we begin watching it.
var fileSize = fs.statSync(options.logFile).size;
fs.watchFile(options.logFile, function (current, previous) {
// Check if file modified time is less than last time.
// If so, nothing changed so don't bother parsing.
if (current.mtime <= previous.mtime) { return; }
// We're only going to read the portion of the file that
// we have not read so far. Obtain new file size.
var newFileSize = fs.statSync(options.logFile).size;
// Calculate size difference.
var sizeDiff = newFileSize - fileSize;
// If less than zero then Hearthstone truncated its log file
// since we last read it in order to save space.
// Set fileSize to zero and set the size difference to the current
// size of the file.
if (sizeDiff < 0) {
fileSize = 0;
sizeDiff = newFileSize;
}
// Create a buffer to hold only the data we intend to read.
var buffer = new Buffer(sizeDiff);
// Obtain reference to the file's descriptor.
var fileDescriptor = fs.openSync(options.logFile, 'r');
// Synchronously read from the file starting from where we read
// to last time and store data in our buffer.
fs.readSync(fileDescriptor, buffer, 0, sizeDiff, fileSize);
fs.closeSync(fileDescriptor); // close the file
// Set old file size to the new size for next read.
fileSize = newFileSize;
// Parse the line(s) in the buffer.
parseBuffer(buffer);
});
function stop () {
fs.unwatchFile(options.logFile);
};
function parseBuffer (buffer) {
// Iterate over each line in the buffer.
buffer.toString().split(options.endOfLineChar).forEach(function (line) {
// Do stuff with the line :)
});
};
It first calculates the initial size of the file because in this log watcher module I only want to read new data as it's being written by the game; I don't care about existing data. It then starts watching the file for changes. When the change handler fires, we check whether the modified time is really newer, because other changes to the file can trigger the handler even when no data we care about has actually changed. We wanted this watcher to be as performant as possible.
We then read the new size of the file and calculate the difference from the last time. This tells us exactly how much data to read from the file to get only the newly written data. Then we store the data in a buffer and parse it as a string, splitting the string on newline characters. Using the core os module to get os.EOL gives you the correct line-ending character for the operating system you are running on (the Windows line ending differs from Linux/Unix).
Now you have an array of lines written to the file :)
On bash you would do something like this with tail --follow.
There is also a package, tail, available.
You can watch a file and get new lines with an event:
const Tail = require('tail').Tail;
var tail = new Tail("var/log/query.log");
tail.watch()
tail.on("line", data => {
console.log(data);
});
I'm trying to implement a routine for Node.js that would allow one to open a file that is being appended to by some other process at this very time, and then return chunks of data immediately as they are appended to the file. It can be thought of as similar to the UNIX command tail -f, however acting immediately as chunks become available, instead of polling for changes over time. Alternatively, one can think of it as working with a file the way you do with a socket: expecting on('data') to trigger from time to time until the file is closed explicitly.
In C land, if I were to implement this, I would just open the file, feed its file descriptor to select() (or any alternative function with a similar purpose), and then read chunks whenever the file descriptor is marked readable. So, when there is nothing to be read, it won't be readable, and when something is appended to the file, it becomes readable again.
I somewhat expected this kind of behavior from the following code sample in JavaScript:
function readThatFile(filename) {
const stream = fs.createReadStream(filename, {
flags: 'r',
encoding: 'utf8',
autoClose: false // I thought this would prevent file closing on EOF too
});
stream.on('error', function(err) {
// handle error
});
stream.on('open', function(fd) {
// save fd, so I can close it later
});
stream.on('data', function(chunk) {
// process chunk
// fs.close() if I no longer need this file
});
}
However, this code sample just bails out when EOF is encountered, so I can't wait for a new chunk to arrive. Of course, I could reimplement this using fs.open and fs.read, but that somewhat defeats Node.js's purpose. Alternatively, I could fs.watch() the file for changes, but that won't work over the network, and I don't like the idea of reopening the file all the time instead of just keeping it open.
I've tried to do this:
const fd = fs.openSync(filename, 'r'); // sync for readability's sake
const stream = net.Socket({ fd: fd, readable: true, writable: false });
But had no luck — net.Socket isn't happy and throws TypeError: Unsupported fd type: FILE.
So, any solutions?
UPD: this isn't possible, my answer explains why.
I haven't looked into the internals of the read streams for files, but it's possible that they don't support waiting for a file to have more data written to it. However, the fs package definitely supports this with its most basic functionality.
To explain how tailing would work, I've written a somewhat hacky tail function which will read an entire file and invoke a callback for every line (separated by \n only) and then wait for the file to have more lines written to it. Note that a more efficient way of doing this would be to have a fixed size line buffer and just shuffle bytes into it (with a special case for extremely long lines), rather than modifying JavaScript strings.
var fs = require('fs');
function tail(path, callback) {
var descriptor, bytes = 0, buffer = new Buffer(256), line = '';
function parse(err, bytesRead, buffer) {
if (err) {
callback(err, null);
return;
}
// Keep track of the bytes we have consumed already.
bytes += bytesRead;
// Combine the buffered line with the new string data.
line += buffer.toString('utf-8', 0, bytesRead);
var i = 0, j;
while ((j = line.indexOf('\n', i)) != -1) {
// Callback with a single line at a time.
callback(null, line.substring(i, j));
// Skip the newline character.
i = j + 1;
}
// Only keep the unparsed string contents for next iteration.
line = line.substr(i);
// Keep reading in the next tick (avoids CPU hogging).
process.nextTick(read);
}
function read() {
var stat = fs.fstatSync(descriptor);
if (stat.size <= bytes) {
// We're currently at the end of the file. Check again in 500 ms.
setTimeout(read, 500);
return;
}
fs.read(descriptor, buffer, 0, buffer.length, bytes, parse);
}
fs.open(path, 'r', function (err, fd) {
if (err) {
callback(err, null);
} else {
descriptor = fd;
read();
}
});
return {close: function close(callback) {
fs.close(descriptor, callback);
}};
}
// This will tail the system log on a Mac.
var t = tail('/var/log/system.log', function (err, line) {
console.log(err, line);
});
// Unceremoniously close the file handle after one minute.
setTimeout(t.close, 60000);
All that said, you should also try to leverage the NPM community. With some searching, I found the tail-stream package which might do what you want, with streams.
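I haven't used tail-stream myself beyond a quick look; from memory it exposes a createReadStream factory that returns a regular readable stream, so usage would look roughly like the sketch below. Treat the function name and any options as something to verify against the package's README.
var ts = require('tail-stream');

// Assumed API: a readable stream that keeps emitting data as the file grows.
var tstream = ts.createReadStream('my.log');

tstream.on('data', function (chunk) {
  console.log('new data: ' + chunk);
});

tstream.on('error', function (err) {
  console.error(err);
});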
Previous answers have mentioned tail-stream's approach which uses fs.watch, fs.read and fs.stat together to create the effect of streaming the contents of the file. You can see that code in action here.
Another, perhaps hackier, approach might be to just use tail by spawning a child process with it. This of course comes with the limitation that tail must exist on the target platform, but one of node's strengths is using it for asynchronous systems development via spawn, and even on Windows you can execute node in an alternate shell like msysgit or cygwin to get access to the tail utility.
The code for this:
var spawn = require('child_process').spawn;
var child = spawn('tail',
['-f', 'my.log']);
child.stdout.on('data',
function (data) {
console.log('tail output: ' + data);
}
);
child.stderr.on('data',
function (data) {
console.log('err data: ' + data);
}
);
So, it seems people have been looking for an answer to this question for five years now, and there is still no on-topic answer.
In short: you can't. Not just in Node.js; you can't at all.
Long answer: there are a few reasons for this.
First, POSIX standard clarifies select() behavior in this regard as follows:
File descriptors associated with regular files shall always select true for ready to read, ready to write, and error conditions.
So, select() can't help with detecting a write beyond the file end.
With poll() it's similar:
Regular files shall always poll TRUE for reading and writing.
I can't tell for sure with epoll(), since it's not standardized and you would have to read a quite lengthy implementation, but I would assume it's similar.
Since libuv, which is at the core of the Node.js implementation, uses read(), pread() and preadv() in its uv__fs_read(), none of which block when invoked at the end of a file, it will always return an empty buffer when EOF is encountered. So, no luck here either.
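You can observe the same thing from JavaScript: a plain read stream simply emits 'end' once it hits the current EOF, no matter what gets appended afterwards. A small sketch to convince yourself (the path is arbitrary):
const fs = require('fs');

const stream = fs.createReadStream('/tmp/growing.log', { encoding: 'utf8' });

stream.on('data', (chunk) => {
  console.log('chunk of length', chunk.length);
});

stream.on('end', () => {
  // Fires as soon as a read hits EOF; appending to the file
  // after this point does not revive the stream.
  console.log('end of file reached, stream is done');
});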
So, summarizing, if such functionality is desired, something must be wrong with your design, and you should revise it.
What you're describing is a FIFO file (an acronym for First In, First Out), which, as you said, works like a socket.
There's a Node.js module that allows you to work with FIFO files.
I don't know what you want it for, but there are better ways to work with sockets in Node.js. Try socket.io instead.
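The module isn't named above, but on POSIX systems you don't strictly need one: Node's own fs streams can read a named pipe created with mkfifo. A rough sketch (it assumes the pipe already exists; the exact blocking behaviour of the open differs between platforms, so this is an illustration rather than a recipe):
// Beforehand, in a shell:  mkfifo /tmp/my.pipe
var fs = require('fs');

var pipe = fs.createReadStream('/tmp/my.pipe', { encoding: 'utf8' });

// Unlike a regular file, a FIFO keeps delivering data as writers send it.
pipe.on('data', function (chunk) {
  console.log('got:', chunk);
});

pipe.on('end', function () {
  // Fires when the last writer closes its end of the pipe.
  console.log('writer closed the pipe');
});

// Then, in another shell:  echo hello > /tmp/my.pipe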
You could also have a look at this previous question:
Reading a file in real-time using Node.js
Update 1
I'm not familiar with any module that would do what you want with a regular file instead of with a socket-type one. But as you said, you could use tail -f to do the trick:
// filename must exist at the time of running the script
var filename = 'somefile.txt';
var spawn = require('child_process').spawn;
var tail = spawn('tail', ['-f', filename]);
tail.stdout.on('data', function (data) {
data = data.toString().replace(/^[\s]+/i,'').replace(/[\s]+$/i,'');
console.log(data);
});
Then, from the command line, try echo someline > somefile.txt and watch the console.
You might also like to have a look at this: https://github.com/layerssss/node-tailer
I'm developing a web app that can upload large files into Azure Blob Storage.
As a backend, I am using Windows Azure Mobile Services (the web app will generate content for mobile devices) in Node.js.
My client can successfully send chunks of data to the backend, and everything looks fine, but at the end the uploaded file is empty. The data upload has been prepared by following this tutorial: it works perfectly when the file is small enough to be uploaded in a single request. The process fails when the file needs to be broken into chunks. It uses the ReadableStreamBuffer from the tutorial.
Can somebody help me?
Here the code:
Back-end : createBlobBlockFromStream
[...]
//Get references
var azure = require('azure');
var qs = require('querystring');
var appSettings = require('mobileservice-config').appSettings;
var accountName = appSettings.STORAGE_NAME;
var accountKey = appSettings.STORAGE_KEY;
var host = accountName + '.blob.core.windows.net';
var container = "zips";
//console.log(request.body);
var blobName = request.body.file;
var blobExt = request.body.ext;
var blockId = request.body.blockId;
var data = new Buffer(request.body.data, "base64");
var stream = new ReadableStreamBuffer(data);
var streamLen = stream.size();
var blobFull = blobName+"."+blobExt;
console.log("BlobFull: "+blobFull+"; id: "+blockId+"; len: "+streamLen+"; "+stream);
var blobService = azure.createBlobService(accountName, accountKey, host);
//console.log("blockId: "+blockId+"; container: "+container+";\nblobFull: "+blobFull+"streamLen: "+streamLen);
blobService.createBlobBlockFromStream(blockId, container, blobFull, stream, streamLen,
function(error, response){
if(error){
request.respond(statusCodes.INTERNAL_SERVER_ERROR, error);
} else {
request.respond(statusCodes.OK, {message : "block created"});
}
});
[...]
Back-end: commitBlobBlock
[...]
var azure = require('azure');
var qs = require('querystring');
var appSettings = require('mobileservice-config').appSettings;
var accountName = appSettings.STORAGE_NAME;
var accountKey = appSettings.STORAGE_KEY;
var host = accountName + '.blob.core.windows.net';
var container = "zips";
var blobName = request.body.file;
var blobExt = request.body.ext;
var blobFull = blobName+"."+blobExt;
var blockIdList = request.body.blockList;
console.log("blobFull: "+blobFull+"; blockIdList: "+JSON.stringify(blockIdList));
var blobService = azure.createBlobService(accountName, accountKey, host);
blobService.commitBlobBlocks(container, blobFull, blockIdList, function(error, result){
if(error){
request.respond(statusCodes.INTERNAL_SERVER_ERROR, error);
} else {
request.respond(statusCodes.OK, result);
blobService.listBlobBlocks(container, blobFull)
}
});
[...]
The second method returns the correct list of block IDs, so I think the second part of the process works fine. I think it is the first method that fails to write the data inside the blocks, as if it creates empty blocks.
In the client-side, I read the file as an ArrayBuffer, by using the FileReader JS API.
Then I convert it into a Base64-encoded string using the following code. This approach works perfectly if I create the blob in a single call, which is fine for small files.
[...]
//data contains the ArrayBuffer read by the FileReader API
var requestData = new Uint8Array(data);
var binary = "";
for (var i = 0; i < requestData.length; i++) {
binary += String.fromCharCode( requestData[ i ] );
}
[...]
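The snippet stops at the raw binary string; presumably the missing last step is the actual Base64 encoding before the chunk is posted to the backend, something like:
// Continue from the loop above: turn the binary string into Base64.
var base64Data = btoa(binary);
// base64Data is what ends up in request.body.data on the backend,
// where new Buffer(request.body.data, "base64") decodes it again.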
Any idea?
Thank you,
Ric
Which version of the Azure Storage Node.js SDK are you using? It looks like you might be using an older version; if so I would recommend upgrading to the latest (0.3.0 as of this writing). We’ve improved many areas with the new library, including blob upload; you might be hitting a bug that has already been fixed. Note that there may be breaking changes between versions.
Download the latest Node.js Module (code is also on Github)
https://www.npmjs.org/package/azure-storage
Read our blog post: Microsoft Azure Storage Client Module for Node.js v. 0.2.0 http://blogs.msdn.com/b/windowsazurestorage/archive/2014/06/26/microsoft-azure-storage-client-module-for-node-js-v-0-2-0.aspx
If that’s not the issue, can you check a Fiddler trace (or equivalent) to see if the raw data blocks are being sent to the service?
Not sure if you're still suffering from this problem, but I was experiencing exactly the same thing and came across this while looking for a solution. Well, I found one and thought I'd share it.
My problem was not with how I pushed the blocks but with how I committed them. My little proxy server had no knowledge of prior commits; it just pushed the data it was sent and committed it. The trouble is I wasn't including the previously committed blocks in the commit, so it was overwriting them with the current commit each time.
So my solution:
var opts = {
UncommittedBlocks: [IdOfJustCommitedBlock],
CommittedBlocks: [IdsOfPreviouslyCommittedBlocks]
}
blobService.commitBlobBlocks('containerName', 'blobName', opts, function(e, r){});
For me, the bit that broke everything was the format of the opts object: I wasn't providing an array of previously committed block names. It's also worth noting that I had to Base64-decode the existing block names, since
blobService.listBlobBlocks('containerName', 'fileName', 'type IE committed', fn)
returns an object for each block with the name Base64-encoded.
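For reference, decoding those names could look roughly like this; the assumption (based on the older SDK) is that the result carries a CommittedBlocks array whose entries expose the name in a Name property:
blobService.listBlobBlocks('containerName', 'blobName', 'committed', function (e, r) {
  // Assumption: r.CommittedBlocks is an array of { Name, Size } entries.
  var committedIds = r.CommittedBlocks.map(function (block) {
    return new Buffer(block.Name, 'base64').toString();
  });
  // committedIds now holds the plain block ids to feed back into opts.CommittedBlocks.
});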
Just for completeness, here's how I push my blocks; req is from the express route:
var blobId = blobService.getBlockId('blobName', 'lengthOfPreviouslyCommittedArray + 1 as Int');
var length = req.headers['content-length'];
blobService.createBlobBlockFromStream(blobId, 'containerName', 'blobName', req, length, fn);
Also, with the upload I had a strange issue where the content-length header caused it to break, so I had to delete it from the req.headers object.
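Concretely, that last workaround is just a one-liner before handing the request stream to createBlobBlockFromStream:
// Work around the content-length issue mentioned above.
delete req.headers['content-length'];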
Hope this helps and is detailed enough.