I'm using exec from child_process.
The function runs fine, but after 4-5 minutes it just stops without reporting any errors, although the script should run for at least 24 hours.
Here is the code:
import { exec } from 'child_process';
function searchDirectory(dirPath) {
  let lineBuffer = '';
  const cmd = `find ${dirPath} -type f -name "*.txt" | pv -l -L 10 -q`;
  const findData = exec(cmd);
  findData.on('error', err => log.error(err));
  findData.stdout.on('data', data => {
    lineBuffer += data;
    let lines = lineBuffer.split('\n');
    for (var i = 0; i < lines.length - 1; i++) {
      let filepath = lines[i];
      processfile(filepath);
    }
    lineBuffer = lines[lines.length - 1];
  });
  findData.stdout.on('end', () => console.log('finished finding...'));
}
The pv command slows down the output. I need this because the path I'm searching is on the network and pretty slow (60mb/s).
When I run the command directly in the terminal it works fine (I didn't wait 24 hours, but I let it run for half an hour and it was still running).
The processfile function actually makes an async call with axios to send some data to a server :
let data = readFileSync(file);
...
axios.post(API_URL, { obj: data }, { proxy: false })
.then(res => {
log.info('Successfully saved object : ' + res.data._id);
})
.catch(err => {
log.error(err.response ? err.response.data : err);
});
What could cause the script to stop? Any ideas?
Thanks
I found the issue: using exec is not recommended for huge outputs, since it uses a limited-size buffer. Use spawn instead:
The most significant difference between child_process.spawn and
child_process.exec is in what they return - spawn returns a stream and
exec returns a buffer.
child_process.spawn returns an object with stdout and stderr streams.
You can tap on the stdout stream to read data that the child process
sends back to Node. stdout being a stream has the "data", "end", and
other events that streams have. spawn is best used when you want
the child process to return a large amount of data to Node - image
processing, reading binary data etc.
child_process.exec returns the whole buffer output from the child
process. By default the buffer size is set at 200k. If the child
process returns anything more than that, your program will crash with
the error message "Error: maxBuffer exceeded". You can fix that
problem by setting a bigger buffer size in the exec options. But you
should not do it because exec is not meant for processes that return
HUGE buffers to Node. You should use spawn for that. So what do you
use exec for? Use it to run programs that return result statuses,
instead of data.
from : https://www.hacksparrow.com/difference-between-spawn-and-exec-of-node-js-child_process.html
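For what it's worth, this would also explain the silent stop: when maxBuffer is exceeded, exec terminates the child and reports the error through its callback, and the code in the question never passes one. Below is a minimal sketch of the spawn variant, not a drop-in replacement. The createLineSplitter helper and the onFile parameter are names made up for this example, and the pv stage of the original pipeline is left out (it could presumably be restored by running the full command string with spawn's shell: true option).

```javascript
const { spawn } = require('child_process');

// Accumulates stream chunks and calls onLine once per complete line.
// (Hypothetical helper, not part of child_process.)
function createLineSplitter(onLine) {
  let pending = '';
  return {
    push(chunk) {
      pending += chunk;
      const lines = pending.split('\n');
      pending = lines.pop(); // last element is an incomplete line (or '')
      lines.forEach(onLine);
    },
    flush() {
      // Emit whatever is left when the stream ends without a final newline.
      if (pending !== '') onLine(pending);
      pending = '';
    },
  };
}

// Sketch of searchDirectory using spawn: stdout is consumed as a stream,
// so exec's maxBuffer limit does not apply.
function searchDirectory(dirPath, onFile) {
  const find = spawn('find', [dirPath, '-type', 'f', '-name', '*.txt']);
  const splitter = createLineSplitter(onFile);
  find.stdout.setEncoding('utf8');
  find.stdout.on('data', chunk => splitter.push(chunk));
  find.stdout.on('end', () => splitter.flush());
  find.on('error', err => console.error(err));
  return find;
}
```

Passing the arguments as an array also avoids the shell quoting problems of interpolating dirPath into a command string.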
I tried to write a program with highland.js to download several files, unzip them and parse into objects, then merge object streams into one stream by flatMap and print out.
function download(url) {
return _(request(url))
.through(zlib.createGunzip())
.errors((err) => console.log('Error in gunzip', err))
.through(toObjParser)
.errors((err) => console.log('Error in OsmToObj', err));
}
const urlList = ['url_1', 'url_2', 'url_3'];
_(urlList)
.flatMap(download)
.each(console.log);
When all URLs are valid, it works fine. If a URL is invalid, no file is downloaded and gunzip reports an error. I suspect that the stream closes when the error occurs. I expected flatMap to continue with the other streams, but the program doesn't download the other files and nothing is printed.
What's the correct way to handle errors in a stream, and how do I make flatMap not stop after one stream has an error?
In imperative programming, I can add debug logs to trace where an error happens. How do I debug streaming code?
PS. toObjParser is a Node Transform Stream. It takes a readable stream of OSM XML and outputs a stream of objects compatible with Overpass OSM JSON. See https://www.npmjs.com/package/osm2obj
2017-12-19 update:
I tried calling push in errors as @amsross suggested. To verify that push really works, I pushed an XML document; it was parsed by the following parser and showed up in the output. However, the stream still stopped and url_3 was not downloaded.
function download(url) {
console.log('download', url);
return _(request(url))
.through(zlib.createGunzip())
.errors((err, push) => {
console.log('Error in gunzip', err);
push(null, Buffer.from(`<?xml version='1.0' encoding='UTF-8'?>
<osmChange version="0.6">
<delete>
<node id="1" version="2" timestamp="2008-10-15T10:06:55Z" uid="5553" user="foo" changeset="1" lat="30.2719406" lon="120.1663723"/>
</delete>
</osmChange>`));
})
.through(new OsmToObj())
.errors((err) => console.log('Error in OsmToObj', err));
}
const urlList = ['url_1_correct', 'url_2_wrong', 'url_3_correct'];
_(urlList)
.flatMap(download)
.each(console.log);
Update 12/19/2017:
Ok, so I can't give you a good why on this, but I can tell you that switching from consuming the streams resulting from download in sequence to merge'ing them together will probably give you the result you're after. Unfortunately (or not?), you will no longer be getting the results back in any prescribed order.
const request = require('request')
const zlib = require('zlib')
const h = require('highland')
// just so you can see there isn't some sort of race
const rnd = (min, max) => Math.floor((Math.random() * (max - min))) + min
const delay = ms => x => h(push => setTimeout(() => {
push(null, x)
push(null, h.nil)
}, ms))
const download = url => h(request(url))
.flatMap(delay(rnd(0, 2000)))
.through(zlib.createGunzip())
h(['url_1_correct', 'url_2_wrong', 'url_3_correct'])
.map(download).merge()
// vs .flatMap(download) or .map(download).sequence()
.errors(err => h.log(err))
.each(h.log)
Update 12/03/2017:
When an error is encountered on the stream, it ends that stream. To avoid this, you need to handle the error. You are currently using errors to report the error, but not handle it. You can do something like this to move on to the next value in the stream:
.errors((err, push) => {
console.log(err)
push(null) // push no error forward
})
Original:
It's difficult to answer without knowing what the input and output types of toObjParser are.
Because through passes a stream of values to the provided function and expects a stream of values in return, your issue may reside in toObjParser having a signature like Stream -> Object, or Stream -> Stream Object, where the errors are occurring on the inner stream, which will not emit any errors until it is consumed.
What is the output of .each(console.log)? If it is logging a stream, that is most likely your problem.
I'm writing an oscillator in JavaScript that creates a sweep (i.e. chirp) between sine wave frequencies. For testing, I'd like to write the samples (which are floats) to a WAV file. How would I do this in Node.js? I've seen lots of information on the browser end of things, but nothing specific to Node or that doesn't rely on browser APIs.
This can be done using the minimal package node-wav and a snippet similar to the one below:
First, install the dependency:
npm i node-wav
Then, use something like
let fs = require('fs');
let wav = require('node-wav');
// Parameters for the below data
const size = 5000
const amplitude = 128
const sampleRate = 20
// Generate some random data for testing: three channels of `size` samples each
const data = Array.from({ length: 3 }, () => Float32Array.from({ length: size }, () => Math.random() * amplitude))
let buffer = wav.encode(data, { sampleRate: sampleRate, float: true, bitDepth: 32 });
fs.writeFile("test.wav", buffer, (err) => {
if (err) return console.log(err);
console.log("test.wav written");
});
Considering that you already know the application of your data, you know all the "constant" parameters (size of the output, bitrate, the actual data to be written, bitdepth).
You can use the built in Node.js fs.writeFile() api to write to a file.
The way I see it, all you have to do is loop over your audio samples, add them to a string within the iteration, and put that string into a file like so:
const fs = require("fs");
// Code to generate audio
let audio = "";
samples.forEach((sample) => {
audio += sample;
});
fs.writeFile("path/to/file.wav or .mp3", audio, (err) => {
if (err) return console.error(err);
console.log("File successfully saved!");
});
If I'm understanding your question correctly, then this should work.
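One caveat: a .wav file is a binary RIFF container with a fixed header, so concatenating sample values into a string will generally not give you something a player can open. If you'd rather not pull in a dependency, here is a minimal sketch that writes the header by hand, assuming mono 16-bit PCM and samples in the range [-1, 1]. The names encodeWav16 and chirp are made up for this example, and the sweep is a plain linear chirp rather than the asker's actual oscillator.

```javascript
// Encodes mono float samples in [-1, 1] as a 16-bit PCM WAV file image.
function encodeWav16(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buf = Buffer.alloc(44 + dataSize);
  buf.write('RIFF', 0, 'ascii');
  buf.writeUInt32LE(36 + dataSize, 4);                // RIFF chunk size
  buf.write('WAVE', 8, 'ascii');
  buf.write('fmt ', 12, 'ascii');
  buf.writeUInt32LE(16, 16);                          // fmt chunk size
  buf.writeUInt16LE(1, 20);                           // format 1 = integer PCM
  buf.writeUInt16LE(1, 22);                           // channel count (mono)
  buf.writeUInt32LE(sampleRate, 24);
  buf.writeUInt32LE(sampleRate * bytesPerSample, 28); // byte rate
  buf.writeUInt16LE(bytesPerSample, 32);              // block align
  buf.writeUInt16LE(16, 34);                          // bits per sample
  buf.write('data', 36, 'ascii');
  buf.writeUInt32LE(dataSize, 40);
  samples.forEach((s, i) => {
    // Clamp to [-1, 1] and scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, s));
    buf.writeInt16LE(Math.round(clamped * 32767), 44 + i * bytesPerSample);
  });
  return buf;
}

// Linear sine sweep from f0 to f1 Hz (illustrative stand-in for the oscillator).
function chirp(f0, f1, seconds, sampleRate) {
  const n = Math.floor(seconds * sampleRate);
  const k = (f1 - f0) / seconds; // sweep rate in Hz per second
  return Float32Array.from({ length: n }, (_, i) => {
    const t = i / sampleRate;
    return Math.sin(2 * Math.PI * (f0 * t + 0.5 * k * t * t));
  });
}

// Usage (writes a two-second sweep; the file name is arbitrary):
// require('fs').writeFileSync('sweep.wav', encodeWav16(chirp(220, 880, 2, 44100), 44100));
```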
I just want to call an external exe from a nodejs-App. This external exe makes some calculations and returns an output the nodejs-App needs. But I have no idea how to make the connection between nodejs and an external exe. So my questions:
How do I call an external exe-file with specific arguments from within nodejs properly?
And how do I have to transmit the output of the exe to nodejs efficiently?
Nodejs shall wait for the output of the external exe. But how does nodejs know when the exe has finished its processing? And then how do I have to deliver the result of the exe? I don't want to create a temporary text-file where I write the output to and nodejs simply reads this text-file. Is there any way I can directly return the output of the exe to nodejs? I don't know how an external exe can directly deliver its output to nodejs. BTW: The exe is my own program. So I have full access to that app and can make any necessary changes. Any help is welcome...
With child_process module.
With stdout.
Code will look like this
var exec = require('child_process').exec;
var result = '';
var child = exec('ping google.com');
child.stdout.on('data', function(data) {
result += data;
});
child.on('close', function() {
console.log('done');
console.log(result);
});
You want to use child_process; you can use exec or spawn, depending on your needs. exec buffers the output and hands it over once the process has finished (it's not live), while spawn gives you streams (it is live). There are also some occasional quirks between the two, which is why I do the funny thing I do to start npm.
Here's a modified example from a tool I wrote that was trying to run npm install for you:
var spawn = require('child_process').spawn;
var isWin = /^win/.test(process.platform);
// Pass the whole command as one string: `sh -c` only runs the first argument after -c.
var child = spawn(isWin ? 'cmd' : 'sh', [isWin ? '/c' : '-c', 'npm install']);
child.stdout.pipe(process.stdout); // I'm logging the output to stdout, but you can pipe it into a text file or an in-memory variable
child.stderr.pipe(process.stderr);
child.on('error', function(err) {
  logger.error('run-install', err);
  process.exit(1); // Or whatever you do on error, such as calling your callback or resolving a promise with an error
});
child.on('exit', function(code) {
  if (code !== 0) throw new Error('npm install failed, see npm-debug.log for more details');
  process.exit(0); // Or whatever you do on completion, such as calling your callback or resolving a promise with the data
});
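If the exe just takes arguments and prints its result, execFile may be the simplest fit: it takes the arguments as an array and calls back once the process has exited, which also answers the "how does Node know it has finished" part. A rough sketch, where runTool is a made-up wrapper name and the path in the usage comment is purely hypothetical:

```javascript
const { execFile } = require('child_process');

// Runs a program with an argument array and resolves with its stdout text
// once the process has exited; rejects on a spawn failure or non-zero exit.
function runTool(exePath, args) {
  return new Promise((resolve, reject) => {
    execFile(exePath, args, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}

// Usage (hypothetical path and arguments):
// runTool('C:\\tools\\calc.exe', ['--input', '42']).then(out => console.log(out));
```

Like exec, execFile buffers the whole output, so this suits small results; for large or continuous output the spawn approach above is the better choice.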
Using Node.js, what is the best way to stream a file from a filesystem into Node.js, but reading it backwards, from bottom to top? I have a large file, and there doesn't seem to be much sense in reading from the top if I only want the last 10 lines. Is this possible?
Right now I have this horrible code, where we do a GET request with a browser to view the server logs, and pass a query string parameter to tell the server how many lines at the end of the log file we want to read:
function get(req, res, next) {
var numOfLinesToRespondWith = req.query.num_lines || 10;
var fileStream = fs.createReadStream(stderr_path, {encoding: 'utf8'});
var jsonData = []; //where jsonData gets populated
var ret = [];
fileStream.on('data', function processLineOfFileData(chunk) {
jsonData.push(String(chunk));
})
.on('end', function handleEndOfFileData(err) {
if (err) {
log.error(colors.bgRed(err));
res.status(500).send({"error reading from smartconnect_stdout_log": err.toString()});
}
else {
for(var i = 0; i < numOfLinesToRespondWith; i++){
ret.push(jsonData.pop());
}
res.status(200).send({"smartconnect_stdout_log": ret});
}
});
}
The code above reads the whole file and only then adds the requested number of lines to the response. This is bad; is there a better way to do this? Any recommendations will be met gladly.
(one problem with the code above is that it's writing out the last lines of the log but the lines are in reverse order...)
One potential way to do this is:
process.exec('tail -r ' + file_path).pipe(process.stdout);
but that syntax is incorrect - so my question there would be - how do I pipe the result of that command into an array in Node.js and eventually into a JSON HTTP response?
I created a module called fs-backwards-stream that may meet your needs. https://www.npmjs.com/package/fs-backwards-stream
If you need the result parsed by lines rather than byte chunks, you should use the module fs-reverse: https://www.npmjs.com/package/fs-reverse
Both of these modules stream. Alternatively, you could simply read the last n bytes of a file.
here is an example using plain node fs apis and no dependencies.
https://gist.github.com/soldair/f250fb497ce592c3694a
hope that helps.
One easy way, if you're on a Linux computer, would be to run the tac command from Node with child_process.exec("tac yourfile.dat") and pipe its stdout to your write stream.
You could also use slice-file and then reverse the order yourself.
Also, look at what @alexmills said in the comments.
this is the best answer I got, for now
the tail command on Mac/UNIX reads files from the end and pipes to stdout (correct me if this is loose language)
var cp = require('child_process');
module.exports = function get(req, res, next) {
  // Parse the requested count as an integer so it is safe to place in the command.
  var numOfLinesToRespondWith = parseInt(req.query.num_lines, 10) || 100;
  cp.exec('tail -n ' + numOfLinesToRespondWith + ' ' + stderr_path, function(err, stdout, stderr) {
    if (err) {
      log.error(colors.bgRed(err));
      res.status(500).send({"error reading from smartconnect_stderr_log": err.toString()});
    }
    else {
      var data = String(stdout).split('\n');
      res.status(200).send({"stderr_log": data});
    }
  });
}
This seems to work really well. It does, however, run in a separate process, which is expensive in its own way, but that is probably better than reading an entire 10,000-line log file.
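For completeness, the same thing can be done without a child process by reading fixed-size chunks backwards from the end of the file until enough newlines have been seen. This is only a sketch: readLastLines and its chunkSize parameter are names invented here, it uses the synchronous fs calls for brevity, and it assumes \n line endings.

```javascript
const fs = require('fs');

// Returns the last `count` lines of a file by reading chunks backwards
// from the end until count + 1 newlines have been collected.
function readLastLines(filePath, count, chunkSize = 64 * 1024) {
  if (count <= 0) return [];
  const fd = fs.openSync(filePath, 'r');
  try {
    let position = fs.fstatSync(fd).size;
    const chunks = [];
    let newlines = 0;
    // count + 1 newlines guarantee `count` complete lines even when the
    // file ends with a trailing '\n'.
    while (position > 0 && newlines <= count) {
      const length = Math.min(chunkSize, position);
      position -= length;
      const chunk = Buffer.alloc(length);
      fs.readSync(fd, chunk, 0, length, position);
      for (const byte of chunk) if (byte === 0x0a) newlines++;
      chunks.unshift(chunk); // keep chunks in file order
    }
    const lines = Buffer.concat(chunks).toString('utf8').split('\n');
    if (lines[lines.length - 1] === '') lines.pop(); // drop trailing newline
    return lines.slice(-count);
  } finally {
    fs.closeSync(fd);
  }
}
```

The returned lines are in file order, so there is no need to reverse them before sending the response.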
I'm trying to implement a routine for Node.js that would allow one to open a file, that is being appended to by some other process at this very time, and then return chunks of data immediately as they are appended to file. It can be thought as similar to tail -f UNIX command, however acting immediately as chunks are available, instead of polling for changes over time. Alternatively, one can think of it as of working with a file as you do with socket — expecting on('data') to trigger from time to time until a file is closed explicitly.
In C land, if I were to implement this, I would just open the file, feed its file descriptor to select() (or any alternative function with similar designation), and then just read chunks as file descriptor is marked "readable". So, when there is nothing to be read, it won't be readable, and when something is appended to file, it's readable again.
I somewhat expected this kind of behavior for following code sample in Javascript:
function readThatFile(filename) {
const stream = fs.createReadStream(filename, {
flags: 'r',
encoding: 'utf8',
autoClose: false // I thought this would prevent file closing on EOF too
});
stream.on('error', function(err) {
// handle error
});
stream.on('open', function(fd) {
// save fd, so I can close it later
});
stream.on('data', function(chunk) {
// process chunk
// fs.close() if I no longer need this file
});
}
However, this code sample just bails out when EOF is encountered, so I can't wait for new chunk to arrive. Of course, I could reimplement this using fs.open and fs.read, but that somewhat defeats Node.js purpose. Alternatively, I could fs.watch() file for changes, but it won't work over network, and I don't like an idea of reopening file all the time instead of just keeping it open.
I've tried to do this:
const fd = fs.openSync(filename, 'r'); // sync for readability' sake
const stream = net.Socket({ fd: fd, readable: true, writable: false });
But had no luck — net.Socket isn't happy and throws TypeError: Unsupported fd type: FILE.
So, any solutions?
UPD: this isn't possible, my answer explains why.
I haven't looked into the internals of the read streams for files, but it's possible that they don't support waiting for a file to have more data written to it. However, the fs package definitely supports this with its most basic functionality.
To explain how tailing would work, I've written a somewhat hacky tail function which will read an entire file and invoke a callback for every line (separated by \n only) and then wait for the file to have more lines written to it. Note that a more efficient way of doing this would be to have a fixed size line buffer and just shuffle bytes into it (with a special case for extremely long lines), rather than modifying JavaScript strings.
var fs = require('fs');
function tail(path, callback) {
var descriptor, bytes = 0, buffer = Buffer.alloc(256), line = '';
function parse(err, bytesRead, buffer) {
if (err) {
callback(err, null);
return;
}
// Keep track of the bytes we have consumed already.
bytes += bytesRead;
// Combine the buffered line with the new string data.
line += buffer.toString('utf-8', 0, bytesRead);
var i = 0, j;
while ((j = line.indexOf('\n', i)) != -1) {
// Callback with a single line at a time.
callback(null, line.substring(i, j));
// Skip the newline character.
i = j + 1;
}
// Only keep the unparsed string contents for next iteration.
line = line.substr(i);
// Keep reading in the next tick (avoids CPU hogging).
process.nextTick(read);
}
function read() {
var stat = fs.fstatSync(descriptor);
if (stat.size <= bytes) {
// We're currently at the end of the file. Check again in 500 ms.
setTimeout(read, 500);
return;
}
fs.read(descriptor, buffer, 0, buffer.length, bytes, parse);
}
fs.open(path, 'r', function (err, fd) {
if (err) {
callback(err, null);
} else {
descriptor = fd;
read();
}
});
return {close: function close(callback) {
fs.close(descriptor, callback);
}};
}
// This will tail the system log on a Mac.
var t = tail('/var/log/system.log', function (err, line) {
console.log(err, line);
});
// Unceremoniously close the file handle after one minute.
setTimeout(t.close, 60000);
All that said, you should also try to leverage the NPM community. With some searching, I found the tail-stream package which might do what you want, with streams.
Previous answers have mentioned tail-stream's approach which uses fs.watch, fs.read and fs.stat together to create the effect of streaming the contents of the file. You can see that code in action here.
Another, perhaps hackier, approach might be to just use tail by spawning a child process with it. This of course comes with the limitation that tail must exist on the target platform, but one of node's strengths is using it to do asynchronous systems development via spawn and even on windows, you can execute node in an alternate shell like msysgit or cygwin to get access to the tail utility.
The code for this:
var spawn = require('child_process').spawn;
var child = spawn('tail',
['-f', 'my.log']);
child.stdout.on('data',
function (data) {
console.log('tail output: ' + data);
}
);
child.stderr.on('data',
function (data) {
console.log('err data: ' + data);
}
);
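One thing to keep in mind with the snippet above: 'data' chunks are arbitrary slices of the output, not lines, so a chunk can end in the middle of a line. A sketch that wraps the stream in Node's readline module to get one callback per line (onEachLine is a hypothetical helper name; the usage comment assumes a tail binary on the PATH):

```javascript
const readline = require('readline');

// Calls onLine for every complete line arriving on a readable stream.
function onEachLine(stream, onLine) {
  const rl = readline.createInterface({ input: stream, crlfDelay: Infinity });
  rl.on('line', onLine);
  return rl;
}

// Usage with tail:
// const child = require('child_process').spawn('tail', ['-f', 'my.log']);
// onEachLine(child.stdout, line => console.log('tail output: ' + line));
```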
It seems people have been looking for an answer to this question for five years now, and there is still no answer on topic.
In short: you can't. Not just in Node.js; you can't at all.
Long answer: there are a few reasons for this.
First, POSIX standard clarifies select() behavior in this regard as follows:
File descriptors associated with regular files shall always select true for ready to read, ready to write, and error conditions.
So, select() can't help with detecting a write beyond the file end.
With poll() it's similar:
Regular files shall always poll TRUE for reading and writing.
epoll() is Linux-specific rather than standardized, but it does not help either: the epoll_ctl() documentation lists EPERM for descriptors that do not support epoll, such as regular files.
Since libuv, which is at the core of Node.js, uses read(), pread() and preadv() in its uv__fs_read(), none of which block when invoked at the end of a file, a read always returns an empty buffer once EOF is reached. So, no luck here either.
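The EOF behaviour is easy to observe from Node itself. A small sketch, in which the temp file name and the readFrom helper are made up for illustration: a read at the end of the file returns zero bytes immediately instead of waiting, and the same descriptor only sees appended data if you ask again later.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads whatever is available from `position`; returns '' at end of file.
function readFrom(fd, position) {
  const buf = Buffer.alloc(1024);
  const n = fs.readSync(fd, buf, 0, buf.length, position);
  return buf.toString('utf8', 0, n);
}

const demoPath = path.join(os.tmpdir(), `eof-demo-${process.pid}.txt`);
fs.writeFileSync(demoPath, 'first\n');
const fd = fs.openSync(demoPath, 'r');

const a = readFrom(fd, 0);        // existing content
const b = readFrom(fd, a.length); // at EOF: returns at once with 0 bytes
fs.appendFileSync(demoPath, 'second\n');
const c = readFrom(fd, a.length); // same descriptor now sees the appended data

fs.closeSync(fd);
fs.unlinkSync(demoPath);
console.log({ a, b, c }); // { a: 'first\n', b: '', c: 'second\n' }
```

So something has to decide when to retry the read, which is exactly the polling or watching step that the approaches in the other answers add.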
So, summarizing, if such functionality is desired, something must be wrong with your design, and you should revise it.
What you're trying to do is a FIFO file (acronym for First In First Out), which as you said works like a socket.
There's a node.js module that allows you to work with fifo files.
I don't know what you want that for, but there are better ways to work with sockets in Node.js. Try socket.io instead.
You could also have a look at this previous question:
Reading a file in real-time using Node.js
Update 1
I'm not familiar with any module that would do what you want with a regular file, instead of with a socket type one. But as you said, you could use tail -f to do the trick:
// filename must exist at the time of running the script
var filename = 'somefile.txt';
var spawn = require('child_process').spawn;
var tail = spawn('tail', ['-f', filename]);
tail.stdout.on('data', function (data) {
data = data.toString().replace(/^[\s]+/i,'').replace(/[\s]+$/i,'');
console.log(data);
});
Then, from the command line, try echo someline >> somefile.txt and watch the console.
You might also like to have a look at this: https://github.com/layerssss/node-tailer