JavaScript can't receive child stream one line at a time

When using child_process.spawn in Node, it spawns a child process and automatically creates stdin, stdout and stderr streams to interact with the child.
const child = require('child_process');
const subProcess = child.spawn("python", ["myPythonScript.py"]);
subProcess.stdout.on('data', function(data) {
  console.log('stdout: ' + data);
});
I thus implemented this in my project, but the thing is that the subprocess actually writes to its output stream only when the buffer reaches a certain size, not whenever data is written to it (whatever the size of that data).
I'd like to receive the subprocess's output as soon as it writes to the output stream, not once it has filled the whole buffer. Any solution?
EDIT: As pointed out by t.888, it should actually be working as I expect. And it actually does if I spawn another subprocess, a C++ one this time. But I don't know why it does not work when I spawn my Python script. Actually, the Python script sends only big chunks of messages via stdout (probably when its buffer is full).

I think that you need readline instead.
const fs = require('fs');
const readline = require('readline');

async function processLineByLine() {
  const fileStream = fs.createReadStream('input.txt');

  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Note: we use the crlfDelay option to recognize all instances of CR LF
  // ('\r\n') in input.txt as a single line break.

  for await (const line of rl) {
    // Each line in input.txt will be successively available here as `line`.
    console.log(`Line from file: ${line}`);
  }
}

processLineByLine();
From https://nodejs.org/api/readline.html#readline_example_read_file_stream_line_by_line
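The same interface also works on a spawned child's stdout, not just on a file stream. A minimal sketch, assuming the spawn call from the question:

const child = require('child_process');
const readline = require('readline');

const subProcess = child.spawn("python", ["myPythonScript.py"]);

// treat the child's stdout as the input of a readline interface
const rl = readline.createInterface({ input: subProcess.stdout });

rl.on('line', (line) => {
  console.log('stdout line: ' + line);
});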

I solved my problem yesterday. It was actually due to Python itself and not to the child_process function.
I have to do
const subProcess = child.spawn("python", ["-u", "myPythonScript.py"])
instead of
const subProcess = child.spawn("python", ["myPythonScript.py"])
Indeed, the -u argument tells Python to flush its output as soon as possible.
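If you can't change the interpreter flags, another option is to set the PYTHONUNBUFFERED environment variable when spawning, which has the same effect as -u; a minimal sketch:

const child = require('child_process');

// PYTHONUNBUFFERED=1 disables Python's output buffering, same as the -u flag
const subProcess = child.spawn("python", ["myPythonScript.py"], {
  env: { ...process.env, PYTHONUNBUFFERED: '1' }
});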

Related

How to read a text file line by line in JavaScript?

I need to read a text file line by line in JavaScript.
I might want to do something with each line (e.g. skip or modify it) and write the line to another file. But the specific actions are out of the scope of this question.
There are many questions with similar wording, but most actually read the whole file into memory in one step instead of reading line by line. So those solutions are unusable for bigger files.
In Node.js v18.11.0, a new function was added to read files line by line:
filehandle.readLines([options])
This is how you use it with a text file you want to read:
import { open } from 'node:fs/promises';

myFileReader();

async function myFileReader() {
  const file = await open('./TextFileName.txt');
  for await (const line of file.readLines()) {
    console.log(line);
  }
}
To learn more, see the Node.js documentation for filehandle.readLines():
https://nodejs.org/api/fs.html#filehandlereadlinesoptions
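Since the question also mentions writing each line to another file, here is a sketch that combines readLines() with a write stream; the file names are placeholders:

import { open } from 'node:fs/promises';
import { createWriteStream } from 'node:fs';

copyLineByLine();

async function copyLineByLine() {
  const file = await open('./input.txt');
  const out = createWriteStream('./output.txt', { encoding: 'utf8' });
  for await (const line of file.readLines()) {
    // skip or modify the line here as needed
    out.write(line + '\n');
  }
  out.end();
}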
The code to read a text file line by line is indeed surprisingly non-trivial and hard to discover.
This code uses NodeJS' readline module to read and write text file line by line. It can work on big files.
const fs = require("fs");
const readline = require("readline");
const input_path = "input.txt";
const output_path = "output.txt";
const inputStream = fs.createReadStream(input_path);
const outputStream = fs.createWriteStream(output_path, { encoding: "utf8" });
var lineReader = readline.createInterface({
input: inputStream,
terminal: false,
});
lineReader.on("line", function (line) {
outputStream.write(line + "\n");
});
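One detail worth adding: the output stream above is never explicitly ended. readline emits a 'close' event once the input is exhausted, so you can finish the output file there; a small addition, assuming the same variable names:

lineReader.on("close", function () {
  // the input file has been fully read, so finish the output file
  outputStream.end();
});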

Retrieving data from a Python script using 'child_process' works in a normal Node.js script, but not in a required script

I am making a Node.js application where I need to send data to a Python script that performs some calculations. Then I need to get the data from the Python script back to a discord.js command script, command.js, which I run from the main script index.js.
I send the data I need to be calculated with the child_process module:
function func1(arg1, arg2) {
  let spawn = require('child_process').spawn;
  const pythonProcess = spawn('python', ['t1.py', arg1, arg2]);
}
Then I retrieve and process the data in Python with the sys module, like this:
import sys
a = sys.argv[1]
b = sys.argv[2]
print(int(a) + int(b))
sys.stdout.flush()
After the data is handled in the Python script t1.py, I retrieve it like this and try to log it (the problem is that it doesn't log anything when I run the main script):
pythonProcess.stdout.on('data', (data) => {
  console.log(Number(data));
});
I then send the resulting data from command.js to index.js using module.exports:
module.exports = {
  'func1': func1
}
Finally, I require the function exported from the command.js script and run it in index.js, passing in two arguments:
let myFunction = require('./command-file/commands.js')
myFunction.func1(2, 2)
This does not work. I simply get no console.log() message at all in my terminal.
However, if I send the data to t1.py directly from index.js, without going through command.js and exporting the function, it works and logs '4'. (Keep in mind I cannot do it this way because of how the rest of the application works; I have to use command.js.) My theory is that for some reason child_process doesn't work with module.exports, but I don't know why...
Structure:
.
├── index.js
└── script.py
index.js:
const { spawn } = require('child_process');

const command = spawn('python', ["./script.py", 1, 2]);

let result = '';
command.stdout.on('data', function (data) {
  result += data.toString();
});
command.on('close', function (code) {
  console.log("RESULT: ", result);
});
script.py:
import sys

a = 0
b = 0
try:
    a = sys.argv[1]
    b = sys.argv[2]
except:
    print("DATA_MISSING")

sys.stdout.write("{}".format(int(a) + int(b)))
Start the code with node index.js
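To make this pattern work from a required module like the question's command.js, one option is to wrap the spawn in an exported function that returns a Promise, so the caller can log or await the result; a sketch along those lines (file and script names follow the question, everything else is an assumption):

// commands.js
const { spawn } = require('child_process');

function func1(arg1, arg2) {
  return new Promise((resolve, reject) => {
    const pythonProcess = spawn('python', ['t1.py', arg1, arg2]);
    let result = '';
    pythonProcess.stdout.on('data', (data) => {
      result += data.toString();
    });
    // resolve once the child has exited and all output has been collected
    pythonProcess.on('close', () => resolve(Number(result)));
    pythonProcess.on('error', reject);
  });
}

module.exports = { func1 };

// index.js
const myFunction = require('./command-file/commands.js');
myFunction.func1(2, 2).then((sum) => console.log(sum)); // should log 4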

Node.js transform stream hangs until the read stream ends

At the moment I'm implementing a pipeline with streams for CSV files to write their lines to a database with a certain model. I started out with writing everything to stdout. That quickly led to unexpected behavior when I attached my custom (Transform) product map stream.
At first I used only the fs read stream, piped that to the CSV transformer (based on the npm package csv-streamify) and wrote to process.stdout. Everything was flowing like normal.
But then I connected my custom transformer, and there it starts to act weird. As soon as I apply any operation (JSON.parse, typeof chunk) to the chunk in the transform, the data no longer flows directly to stdout but, as it appears, only when the read stream is done.
Does anybody know why this occurs?
My pipeline:
const filePath = path.join(__dirname, 'file1.csv');

// From the npm module csv-streamify
const csvTransformer = csv({ objectMode: false, columns: true });

const mapper = new CsvMapper();
const productMapStream = ProductMapStream.create(mapper);
const writeStream = ProductWriteStream.create(this.connectionPool);

fs.createReadStream(filePath)
  .pipe(csvTransformer)
  .pipe(productMapStream)
  .pipe(process.stdout)
  .on('finish', () => resolve('Ok'));
My custom transform stream:
export class ProductMapStream {
  static create(mapper: ProductMappable) {
    return new Transform({
      transform(chunk: any, enc, callback) {
        try {
          const chunkAsString = chunk.toString('utf8');
          // This already prevents continuous flowing
          const newString = '' + (typeof chunkAsString);
          this.push(newString);
          callback(null);
        } catch (e) {
          callback(e);
        }
      }
    }).once('error', console.log);
  }
}
EDIT:
After some experimentation based on the comments of @PatrickDillon, I've found that this problem only occurs when the code is run inside a Docker container. I've tried different Node versions based on Docker node images: I started out with node:8.12.0-alpine and also tried node:10.11.0-jessie, but unfortunately there is no difference in behavior. Does anybody know of any special behavior of Docker with fs or streams, or anything that might seem related?

How do I configure the concurrency of node.js?

Is there a way to configure the maximum capacity of Node.js? For example, say I have 5 URLs but, with limited hardware resources, I only want to process 2 at a time. Is there an option that I can set in Node.js, such that I don't need to control it in my code?
urls.txt
https://example.com/1
https://example.com/2
https://example.com/3
https://example.com/4
https://example.com/5
index.js
const readline = require('readline')
const fs = require('fs')

const rl = readline.createInterface({
  input: fs.createReadStream('urls.txt')
})

rl.on('line', (input) => {
  console.log(`Do something with: ${input}`);
})
Is there an option that I can set in Node.js, such that I don't need to control it in my code?
Nope. That's what your code is for.
Alternatively, let the kernel handle it, since you're concerned about system resources.
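For completeness, limiting concurrency in your own code takes only a few lines. A minimal sketch with no extra library, processing at most two URLs at a time in batches; handleUrl is a placeholder for whatever work you do per URL:

const readline = require('readline')
const fs = require('fs')

const rl = readline.createInterface({
  input: fs.createReadStream('urls.txt')
})

async function handleUrl(url) {
  // placeholder: fetch, download, or otherwise process the URL
  console.log(`Do something with: ${url}`);
}

async function main() {
  // collect the lines first, then work through them two at a time
  const urls = [];
  for await (const line of rl) {
    urls.push(line);
  }
  const limit = 2;
  for (let i = 0; i < urls.length; i += limit) {
    // wait for the current batch before starting the next one
    await Promise.all(urls.slice(i, i + limit).map(handleUrl));
  }
}

main();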

Pipe a JavaScript variable into a shell command with Node.js

I'm working on a Node.js application and I need to pipe a multi-line string into a shell command. I'm not a pro at shell scripting, but if I run this command in my terminal it works just fine:
$((cat $filePath) | dayone new)
Here's what I've got for the Node.js side. The dayone command does run, but nothing is piped into it.
const cp = require('child_process');
const terminal = cp.spawn('bash');
var multiLineVariable = 'Multi\nline\nstring';
terminal.stdin.write('mul');
cp.exec('dayone new', (error, stdout, stderr) => {
console.log(error, stdout, stderr);
});
terminal.stdin.end();
Thanks for any help!
Here, you're starting up bash using spawn, but then you're using exec to start your dayone program. They are separate child processes and aren't connected in any way.
'cp' is just a reference to the child_process module, and spawn and exec are just two different ways of starting child processes.
You could use bash and write your dayone command to stdin in order to invoke dayone (as your snippet seems to be trying to do), or you could just invoke dayone directly with exec (bear in mind exec still runs the command in a shell):
var multiLineVariable = 'Multi\nline\nstring';

// get the child_process module
const cp = require('child_process');

// open a child process (named `child` here to avoid shadowing the global `process`)
const child = cp.exec('dayone new', (error, stdout, stderr) => {
  console.log(error, stdout, stderr);
});

// write your multiline variable to the child process
child.stdin.write(multiLineVariable);
child.stdin.end();
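A note on the design choice: if you don't need a shell at all, you could also spawn dayone directly and write to its stdin, which avoids any shell quoting issues; a sketch, assuming dayone is on your PATH:

const cp = require('child_process');

var multiLineVariable = 'Multi\nline\nstring';

// spawn dayone directly, without an intermediate shell
const dayone = cp.spawn('dayone', ['new']);

dayone.stdout.on('data', (data) => console.log(data.toString()));
dayone.on('close', (code) => console.log('dayone exited with code ' + code));

// pipe the variable into the command via stdin
dayone.stdin.write(multiLineVariable);
dayone.stdin.end();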
With Readable Streams it's really easy to listen to the input
const chunks = [];

process.stdin.on('readable', () => {
  let chunk;
  // read everything that is currently available
  while ((chunk = process.stdin.read()) !== null) {
    chunks.push(chunk);
  }
});

process.stdin.on('end', () => {
  const result = Buffer.concat(chunks);
  console.log(result.toString());
});
With Writable Streams you can write to stdout:
process.stdout.write('Multi\nline\nstring');
Hope this helps!
