Pipe spawned process stdout to function on flush - javascript

My goal is to spawn another binary in a child process and handle its stdout line by line (doing some processing on each line). For testing I'm using Node. I tried passing a readable and a writable stream, but a TypeError is thrown saying "The argument 'stdio' is invalid. Received Writable":
const rs = new stream.Readable();
const ws = new stream.Writable();

const child = cp.spawn("node", [], {
  stdio: [process.stdin, ws, process.stderr]
})

let count = 0;
ws.on("data", (data) => {
  console.log(data, count)
});
Anyone have any ideas?

One way is to use the stdio streams returned by spawn and pipe them manually. Note that the direction matters (the parent's stdin pipes into the child's stdin), and that a plain Writable does not emit 'data' events; give it a write() implementation instead:
const child = cp.spawn("node", []);

process.stdin.pipe(child.stdin);
child.stderr.pipe(process.stderr);

let count = 0;
const ws = new stream.Writable({
  // Called for every chunk flushed from the child's stdout.
  write(chunk, encoding, callback) {
    console.log(chunk.toString(), count++);
    callback();
  }
});
child.stdout.pipe(ws);
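Since the goal is to handle the output line by line, another option (a small sketch, not from the original answer, using Node's built-in readline module) is to split the child's stdout into lines:
const cp = require("child_process");
const readline = require("readline");

const child = cp.spawn("node", []);

// readline buffers the child's stdout and emits one 'line' event per line.
const rl = readline.createInterface({ input: child.stdout, crlfDelay: Infinity });

let count = 0;
rl.on("line", (line) => {
  console.log(line, count++); // do the per-line processing here
});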

Related

How to use Cspell --file-list stdin

I want to use the cspell --file-list command as a child process in Node.js.
I want to pass a large array of strings to this child process and feed it via stdin.
var child = spawn('cspell --file-list', [], { shell: true });
Now I want to pass the strings one by one to this child process.
Can someone help me with a small example?
Send the files as arguments instead of via stdin:
const spawn = require('child_process').spawn;
const cmd = 'cspell';

const checkFiles = (files) => {
  const proc = spawn(cmd, ['--file-list'].concat(files), { shell: true });
  const buffers = [];

  proc.stdout.on('data', (chunk) => buffers.push(chunk));

  proc.stderr.on('data', (data) => {
    console.error(`stderr: ${data.toString()}`);
  });

  proc.stdout.on('end', () => {
    const result = (Buffer.concat(buffers)).toString();
    console.log(`done, result:\n${result}`);
  });
};

// pass files array
checkFiles(['some-file', 'another-file']);
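If you specifically want to feed the file list through stdin, as the question asks, a rough sketch could look like the following. It assumes your cspell version accepts stdin as the value of --file-list (check cspell --help); write one path per line and close stdin when done:
const { spawn } = require('child_process');

const files = ['some-file', 'another-file'];
const proc = spawn('cspell', ['--file-list', 'stdin'], { shell: true });

// One file path per line; ending stdin tells cspell the list is complete.
files.forEach((file) => proc.stdin.write(`${file}\n`));
proc.stdin.end();

proc.stdout.pipe(process.stdout);
proc.stderr.pipe(process.stderr);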

How can I open an Image file using fs in NodeJS?

My current code can only read a text file. How can I open an image (base64) file with the Photos application on Windows? Is there any way to do that? If it's impossible, please let me know!
const fs = require('fs')

fs.readFile('./Test/a.txt', 'utf8', (err, data) => {
  if (err) {
    console.error(err)
    return
  }
  console.log(data)
  return
})
One possible solution is to spawn a program that can open images, passing the file path as an argument:
const cp = require('child_process');

const imageFilePath = '/aaa/bbb/ccc';
// Placeholder viewer; without { shell: true } the path must not be wrapped in extra quotes.
const c = cp.spawn('a_program_that_opens_images', [imageFilePath]);

c.stdout.pipe(process.stdout);
c.stderr.pipe(process.stderr);

c.once('exit', exitCode => {
  // child process has exited
});
Alternatively, do something like this:
const cp = require('child_process');

const c = cp.spawn('bash'); // 1

const imageFilePath = '/aaa/bbb/ccc';
c.stdin.end(`
program_that_opens_images "${imageFilePath}"
`); // 2

c.stdout.pipe(process.stdout); // 3
c.stderr.pipe(process.stderr);

c.once('exit', exitCode => { // 4
  // child process has exited
});
what it does:
spawns a bash child process (use sh or zsh instead if you want)
writes to bash stdin, (inputting the command to run)
pipes the stdio from the child to the parent
captures the exit code from the child
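For the specific Windows case in the question, one sketch (assuming Windows with cmd.exe, not part of the original answer) is to let the start built-in hand the file to its default app, which is usually Photos for images:
const cp = require('child_process');

// Hypothetical path; point it at a real image on disk.
const imageFilePath = 'C:\\pictures\\a.png';

// start is a cmd.exe built-in, so it needs a shell.
// The empty "" is start's window-title argument.
const c = cp.spawn(`start "" "${imageFilePath}"`, { shell: true });

c.once('exit', exitCode => {
  // note: this is the exit of the launcher, not of the viewer window
});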

Communicating between NodeJS and Python: Passing back multiple arguments

Right now I am using the built-in child_process to start the Python script and listen for any data passed back via stdout.on('data', ...), as in the first JS snippet below. But from the searching I have done I only see examples of a single value being passed back, or a group of values clumped together. I was wondering whether it is possible to send back more than one value. Below is my code:
JS:
const spawn = require('child_process').spawn;
pythonProcess = spawn('python', ["/path/to/python/file"]);
pythonProcess.stdout.on('data', (data) => {
  console.log(data);
});
Python:
import sys

thing1 = "Cold"
thing2 = "Hot"
thing3 = "Warm"

print(thing1)
print(thing2)
print(thing3)
sys.stdout.flush()
But what I want is to pass back something like an array filled with the things I want to send, so that I can access them in the JS file like so:
const spawn = require('child_process').spawn;
pythonProcess = spawn('python', ["/path/to/python/file"]);
pythonProcess.stdout.on('data', (data) => {
  thing1 = data[0];
  thing2 = data[1];
  thing3 = data[2];
})
console.log('thing1: ' + thing1);
console.log('thing2: ' + thing2);
console.log('thing3: ' + thing3);
Which would output:
thing1: Hot
thing2: Cold
thing3: Warm
How would I do this?
Thanks in advance!
There isn't an interface that communicates directly between Node.js and Python, so you can't pass back structured arguments. What you're doing is simply executing a Python program with child_process, and anything received on 'data' is whatever the Python script printed to stdout.
So what you need to do is serialize the data in Python and then deserialize it in Node; JSON works well for this.
From your Python script, print a single JSON object (for example with json.dumps):
{
  "thing1": "Hot",
  "thing2": "Cold",
  "thing3": "Warm"
}
And in your Node.js script:
const spawn = require('child_process').spawn;
const pythonProcess = spawn('python', ["/path/to/python/file"]);
const chunks = [];

pythonProcess.stdout.on('data', chunk => chunks.push(chunk));

pythonProcess.stdout.on('end', () => {
  try {
    // If it parses as JSON, handle the data
    const data = JSON.parse(Buffer.concat(chunks).toString());
    console.log(data);
    // {
    //   "thing1": "Hot",
    //   "thing2": "Cold",
    //   "thing3": "Warm"
    // }
  } catch (e) {
    // Handle the parse error
    console.error(e);
  }
});
Keep in mind that the data arrives in chunks, so you have to wait until the 'end' event is emitted before parsing the JSON, otherwise a SyntaxError will be thrown. (See: Sending JSON from Python to Node via child_process gets truncated if too long, how to fix?)
You can use any kind of serialization you're comfortable with; JSON is the easiest since we're already in JavaScript.
Note that stdout is a stream, so it's asynchronous; that's why your example would never work:
pythonProcess.stdout.on('data', (data) => {
  thing1 = data[0];
  thing2 = data[1];
  thing3 = data[2];
})

// The things do not exist here yet
console.log('thing1: ' + thing1);
console.log('thing2: ' + thing2);
console.log('thing3: ' + thing3);
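To actually use the values once they exist, one option (a sketch reusing the JSON approach above; the runPython name is made up) is to wrap the child in a Promise and read the parsed object after the 'end' event:
const { spawn } = require('child_process');

function runPython() {
  return new Promise((resolve, reject) => {
    const proc = spawn('python', ['/path/to/python/file']);
    const chunks = [];
    proc.stdout.on('data', chunk => chunks.push(chunk));
    proc.on('error', reject);
    proc.stdout.on('end', () => {
      try {
        resolve(JSON.parse(Buffer.concat(chunks).toString()));
      } catch (e) {
        reject(e);
      }
    });
  });
}

runPython().then(({ thing1, thing2, thing3 }) => {
  // The values exist here, after the child has finished.
  console.log('thing1: ' + thing1);
  console.log('thing2: ' + thing2);
  console.log('thing3: ' + thing3);
});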

Node csv-parse halting after 16 rows

I'm experiencing very weird behavior running csv-parse in the following setup:
csv - ^1.1.0
stream-transform - ^0.1.1
node - v4.6.0
And running the following code to transform CSVs into an array of objects:
var parse = require('csv').parse
var fs = require('fs')
var streamtransform = require('stream-transform')
function mapCsvRow(headers, record) {
  return record.reduce((p, c, i) => {
    p[headers[i]] = c //eslint-disable-line
    return p
  }, {})
}

function parseFile(path) {
  var headers
  var output = []
  var parser = parse({ delimiter: ',' })
  var input = fs.createReadStream(path)
  var transformer = streamtransform((record) => {
    if (!headers) {
      headers = record
      return record
    }
    output.push(mapCsvRow(headers, record))
    return record
  })
  // Return a new promise to wrap the parsing stream
  return new Promise((resolve, reject) => {
    input
      .pipe(parser)
      .pipe(transformer)
      .on('error', e => reject(e))
      .on('finish', () => resolve(output))
  })
}

module.exports = parseFile
What happens is that the parser halts when processing files larger than 16 records: no error, no finish, nothing.
I have no idea how to debug this; I can't get any output from the parser when it happens.
It looks like you have a readable stream and a transformer stream, but no writable stream consuming the transformer's output. The transformer's internal buffer therefore fills up (the default highWaterMark for object-mode streams is 16, which matches the 16 rows you see) and backpressure pauses the read stream.
Try rewriting your code so it doesn't rely on the output array, or at least consume the transformer's readable side; it's pointless to use streams if you hold all the results in memory anyway.
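For example, keeping the rest of parseFile the same, the promise part could become (a minimal sketch using the same modules and variables as above):
return new Promise((resolve, reject) => {
  input
    .pipe(parser)
    .pipe(transformer)
    .on('data', () => {})              // discard records here; they are already pushed to output
    .on('error', e => reject(e))
    .on('end', () => resolve(output))  // 'end' fires once the readable side has been drained
})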

Node.js - How can I prevent interrupted child processes from surviving?

I have found that some child processes are failing to terminate if the calling script is interrupted.
Specifically, I have a module that uses Ghostscript to perform various actions: extract page images, create a new pdf from a slice, etc. I use the following to execute the command and return a through stream of the child's stdout:
function spawnStream(command, args, storeStdout, cbSuccess) {
  storeStdout = storeStdout || false;
  const child = spawn(command, args);
  const stream = through(data => stream.emit('data', data));

  let stdout = '';
  child.stdout.on('data', data => {
    if (storeStdout === true) stdout += data;
    stream.write(data);
  });

  let stderr = '';
  child.stderr.on('data', data => stderr += data);

  child.on('close', code => {
    stream.emit('end');
    if (code > 0) return stream.emit('error', stderr);
    if (!!cbSuccess) cbSuccess(stdout);
  });

  return stream;
}
This is invoked by functions such as:
function extractPage(pathname, page) {
  const internalRes = 96;
  const downScaleFactor = 1;

  return spawnStream(PATH_TO_GS, [
    '-q',
    '-sstdout=%stderr',
    '-dBATCH',
    '-dNOPAUSE',
    '-sDEVICE=pngalpha',
    `-r${internalRes}`,
    `-dDownScaleFactor=${downScaleFactor}`,
    `-dFirstPage=${page}`,
    `-dLastPage=${page}`,
    '-sOutputFile=%stdout',
    pathname
  ]);
}
which is consumed, for example, like this:
it('given a pdf pathname and page number, returns the image as a stream', () => {
  const document = path.resolve(__dirname, 'samples', 'document.pdf');
  const test = new Promise((resolve, reject) => {
    const imageBlob = extract(document, 1);
    imageBlob.on('data', data => {
      // do nothing in this test
    });
    imageBlob.on('end', () => resolve(true));
    imageBlob.on('error', err => reject(err));
  });

  return Promise.all([expect(test).to.eventually.equal(true)]);
});
When this is interrupted, for example if the test times out or an unhandled error occurs, the child process doesn't seem to receive any signal and survives. It's a bit confusing, as no individual operation is particularly complex and yet the process appears to survive indefinitely, using 100% of CPU.
☁ ~ ps aux | grep gs | head -n 5
rwick 5735 100.0 4.2 3162908 699484 s000 R 12:54AM 6:28.13 gs -q -sstdout=%stderr -dBATCH -dNOPAUSE -sDEVICE=pngalpha -r96 -dDownScaleFactor=1 -dFirstPage=3 -dLastPage=3 -sOutputFile=%stdout /Users/rwick/projects/xan-desk/test/samples/document.pdf
rwick 5734 100.0 4.2 3171100 706260 s000 R 12:54AM 6:28.24 gs -q -sstdout=%stderr -dBATCH -dNOPAUSE -sDEVICE=pngalpha -r96 -dDownScaleFactor=1 -dFirstPage=2 -dLastPage=2 -sOutputFile=%stdout /Users/rwick/projects/xan-desk/test/samples/document.pdf
rwick 5733 100.0 4.1 3154808 689000 s000 R 12:54AM 6:28.36 gs -q -sstdout=%stderr -dBATCH -dNOPAUSE -sDEVICE=pngalpha -r96 -dDownScaleFactor=1 -dFirstPage=1 -dLastPage=1 -sOutputFile=%stdout /Users/rwick/projects/xan-desk/test/samples/document.pdf
rwick 5732 100.0 4.2 3157360 696556 s000 R 12:54AM 6:28.29 gs -q -sstdout=%stderr -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=%stdout /Users/rwick/projects/xan-desk/test/samples/document.pdf /Users/rwick/projects/xan-desk/test/samples/page.pdf
I thought about using a timer to send a kill signal to the child, but choosing an arbitrary interval to kill a process seems like trading a known problem for an unknown one and kicking the can down the road.
I would really appreciate any insight into what I'm missing here. Is there a better option to encapsulate child processes so the termination of the parent is more likely to precipitate the child's interrupt?
Listen to the child's error event and terminate the child there:
child.on('error', function(err) {
  console.error(err);
  // handle the error, then stop the child
  try {
    // child.kill() or child.disconnect()
  } catch (e) {
    console.error(e);
  }
});
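Beyond that, it can help to track spawned children and kill them explicitly when the parent itself is interrupted. This is only a rough sketch (the helper names are made up, not from the original code):
const { spawn } = require('child_process');

const children = new Set();

// Wrap spawn so every child is tracked until it closes.
function trackedSpawn(command, args) {
  const child = spawn(command, args);
  children.add(child);
  child.once('close', () => children.delete(child));
  return child;
}

function killAll() {
  for (const child of children) child.kill('SIGTERM');
}

// Clean up when the parent is interrupted or crashes, then exit.
process.once('SIGINT', () => { killAll(); process.exit(130); });
process.once('uncaughtException', err => { console.error(err); killAll(); process.exit(1); });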
