Apply transform stream to write stream without controlling read stream? - javascript

I have a function that expects a write stream to which I am providing the following stream:
const logStream = fs.createWriteStream('./log.txt')
fn(logStream)
fn is provided by a third-party module, so I do not control its implementation. Internally, I know that fn eventually does this:
// super simplified
fn (logStream) {
  // ...
  stream.pipe(logStream, { end: true })
  // ...
}
My issue is that I know the read stream (stream) contains ANSI escape codes which I don't want written to my log.txt. After a quick Google search, I found chalk/strip-ansi-stream, which is a transform stream designed to do just that.
So, being the Node streams newbie that I am, I decided to try to modify my code to this:
const stripAnsiStream = require('strip-ansi-stream')
const logStream = fs.createWriteStream('./log.txt')
fn(stripAnsiStream().pipe(logStream))
... which does not work: my log file still contains content with the ANSI escape codes. I think this is because instead of creating a chain like
a.pipe(b).pipe(c)
I've actually done
a.pipe(b.pipe(c))
How can I apply this transform stream to my write stream without controlling the beginning of the pipe chain where the read stream is provided?

To allow chaining, stream.pipe() returns the destination stream passed to it: the return value of b.pipe(c) is c.
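You can see this with two PassThrough streams (a minimal sketch; the stream names are only for illustration):
const { PassThrough } = require('stream')
const b = new PassThrough()
const c = new PassThrough()
console.log(b.pipe(c) === c) // true: pipe() returns its destination, not the source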
When you call fn(b.pipe(c)), you're actually bypassing transform stream b and inputting the write stream c directly.
Case #1: a.pipe(b.pipe(c))
b.pipe(c)
a.pipe(c)
Case #2: a.pipe(b).pipe(c)
a.pipe(b)
b.pipe(c)
The transform stream can be piped into the log stream, and then passed into the module separately. You're effectively using case #2, but starting the pipes in reverse order.
const stripAnsiStream = require('strip-ansi-stream')
const fn = require('my-third-party-module')
const transformStream = stripAnsiStream()
const logStream = fs.createWriteStream('./log.txt')
transformStream.pipe(logStream)
fn(transformStream)

Related

How can I pass an ArrayBuffer from JS to AssemblyScript/Wasm?

I have a pretty straightforward piece of TypeScript code that parses a specific data format; the input is a Uint8Array. I've optimized it as far as I can, but I think this rather simple parser should be able to run faster than I can make it run as JS. I wanted to try writing it in WebAssembly using AssemblyScript to make sure I'm not running into any quirks of the JavaScript engines.
As I've now figured out, I can't just pass a TypedArray to Wasm and have it work automatically. As far as I understand, I can pass a pointer to the array and should be able to access it directly from Wasm without copying the array, but I can't get this to work with AssemblyScript.
The following is a minimal example that shows how I'm failing to pass an ArrayBuffer to Wasm.
The code to set up the Wasm export is mostly from the automatically generated boilerplate:
const fs = require("fs");
const compiled = new WebAssembly.Module(
  fs.readFileSync(__dirname + "/build/optimized.wasm")
);
const imports = {
  env: {
    abort(msgPtr, filePtr, line, column) {
      throw new Error(`index.ts: abort at [${line}:${column}]`);
    }
  }
};
Object.defineProperty(module, "exports", {
  get: () => new WebAssembly.Instance(compiled, imports).exports
});
The following code invokes the Wasm module; index.js is the glue code above.
const m = require("./index.js");
const data = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]);
const result = m.parse(data.buffer);
And the AssemblyScript that is compiled to WASM is the following:
import "allocator/arena";
export function parse(offset: usize): number {
return load<u8>(offset);
}
I get a "RuntimeError: memory access out of bounds" when I execute that code.
The major problem is that the errors I get back from Wasm are simply not helpful to figure this out on my own. I'm obviously missing some major aspects of how this actually works behind the scenes.
How do I actually pass a TypedArray or an ArrayBuffer from JS to Wasm using AssemblyScript?
In AssemblyScript, there are many ways to read data from memory. The fastest way to get this data is to use a linked function in your module's function imports to return a pointer to the data itself.
let myData = new Float64Array(100); // have some data in AssemblyScript

// We should specify the location of our linked function
@external("env", "sendFloat64Array")
declare function sendFloat64Array(pointer: usize, length: i32): void;

/**
 * The underlying array buffer has a special property called `data` which
 * points to the start of the memory.
 */
sendFloat64Array(myData.buffer.data, myData.length);
Then in JavaScript, we can use the Float64Array constructor inside our linked function to return the values directly.
/**
 * This is the fastest way to receive the data. Add a linked function like this.
 */
imports.env.sendFloat64Array = function sendFloat64Array(pointer, length) {
  var data = new Float64Array(wasmmodule.memory.buffer, pointer, length);
};
However, there is a much clearer way to obtain the data, and it involves returning a reference from AssemblyScript, and then using the AssemblyScript loader.
let myData = new Float64Array(100); // have some data in AssemblyScript

export function getData(): Float64Array {
  return myData;
}
Then in JavaScript, we can use the ASUtil loader provided by AssemblyScript.
import { instantiateStreaming } from "assemblyscript/lib/loader";
let wasm: ASUtil = await instantiateStreaming(fetch("myAssemblyScriptModule.wasm"), imports);
let dataReference: number = wasm.getData();
let data: Float64Array = wasm.getArray(Float64Array, dataReference);
I highly recommend using the second example for code clarity reasons, unless performance is absolutely critical.
Good luck with your AssemblyScript project!

Evaluate JS file with template strings from another file

I would like to make use of a function called executeJavaScript() from the Electron webContents API. Since it is very close to eval() I will use this in the example.
The problem:
I have a decent sized script but it is contained in a template string.
Expanding this app, the script could grow a lot as a string.
I am not sure what the best practices are for this.
I also understand that eval() is dangerous, but I am interested in the principle of my question.
Basic eval example for my question:
// Modules
const fs = require('fs');
// CONSTANTS
const EXAMPLE_1 = 'EXAMPLE_1';
const EXAMPLE_2 = 'EXAMPLE_2';
const EXAMPLE_3 = 'EXAMPLE_3';
const exampleScriptFunction = require('./exampleScriptFunction');
const exampleScriptFile = fs.readFileSync('./exampleScriptFile.js');
// using direct template string
eval(`console.log(${EXAMPLE_1})`);
// using a method from another file, but this doesn't solve the neatness issue.
eval(exampleScriptFunction(EXAMPLE_2));
// What I want is to just use a JS file because it is neater.
eval(`${exampleScriptFile}`);
exampleScriptFunction.js
module.exports = function(fetchType) {
  return `console.log(${fetchType});`;
}
This will allow me to separate the script into a new file, but what if I have more than one variable?
exampleScriptFile.js:
console.log(${EXAMPLE_3});
This clearly does not work, but I am just trying to show my thinking.
Backticks are not present; fs loads the file as a string, and the main file has the backticks.
This does not work. I do not know how else to show what I mean.
Because I am loading this with readFileSync, I figured the ES6 template string would work.
This allows me to write a plain JS file with proper syntax highlighting.
The issue is the variables are on the page running the eval().
Perhaps I am completely wrong here and looking at this the wrong way. I am open to suggestions. Please do not mark me minus 1 because of my infancy in programming. I really do not know how else to ask this question. Thank you.
Assuming your source is stored in exampleScriptFile:
// polyfill
const fs = { readFileSync() { return 'console.log(`${EXAMPLE_3}`);'; } };
// CONSTANTS
const EXAMPLE_1 = 'EXAMPLE_1';
const EXAMPLE_2 = 'EXAMPLE_2';
const EXAMPLE_3 = 'EXAMPLE_3';
const exampleScriptFile = fs.readFileSync('./exampleScriptFile.js');
// What I want is to just use a JS file because it is neater.
eval(exampleScriptFile);
Update
Perhaps I wasn't clear. The ./exampleScriptFile.js should be:
console.log(`${EXAMPLE_3}`);
While what you're describing can be done with eval as @PatrickRoberts demonstrates, that doesn't extend to executeJavaScript.
The former runs in the caller's context, while the latter triggers an IPC call to another process with the contents of the code. Presumably this process doesn't have any information on the caller's context, and therefore, the template strings can't be populated with variables defined in this context.
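So if you need values from your current context inside executeJavaScript, one workaround is to bake them into the code string before sending it. A minimal sketch (win and EXAMPLE_3 are assumed to already exist in the caller's scope):
const code = `console.log(${JSON.stringify(EXAMPLE_3)});`;
win.webContents.executeJavaScript(code);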
Relevant snippets from electron/lib/browser/api/web-contents.js:
WebContents.prototype.send = function (channel, ...args) {
  // ...
  return this._send(false, channel, args)
}
// ...
WebContents.prototype.executeJavaScript = function (code, hasUserGesture, callback) {
  // ...
  return asyncWebFrameMethods.call(this, requestId, 'executeJavaScript',
  // ...
}
// ...
const asyncWebFrameMethods = function (requestId, method, callback, ...args) {
  return new Promise((resolve, reject) => {
    this.send('ELECTRON_INTERNAL_RENDERER_ASYNC_WEB_FRAME_METHOD', requestId, method, args)
    // ...
  })
}
Relevant snippets from electron/atom/browser/api/atom_api_web_contents.cc
//...
void WebContents::BuildPrototype(v8::Isolate* isolate,
                                 v8::Local<v8::FunctionTemplate> prototype) {
  prototype->SetClassName(mate::StringToV8(isolate, "WebContents"));
  mate::ObjectTemplateBuilder(isolate, prototype->PrototypeTemplate())
      // ...
      .SetMethod("_send", &WebContents::SendIPCMessage)
      // ...
}

How to read stream of JSON objects per object

I have a binary application which generates a continuous stream of JSON objects (not an array of JSON objects). A JSON object can sometimes span multiple lines (still being a valid JSON object, just prettified).
I can connect to this stream and read it without problems like:
var child = require('child_process').spawn('binary', ['arg','arg']);
child.stdout.on('data', data => {
  console.log(data);
});
Streams are buffers and emit data events whenever they please, so I played with the readline module in order to parse the buffers into lines. This works (I'm able to JSON.parse() the line) for JSON objects which don't span multiple lines.
The optimal solution would be to listen for events which return a single JSON object, something like:
child.on('json', object => {
});
I have noticed the objectMode option in the Node streams documentation, however I'm getting a stream in Buffer format, so I believe I'm unable to use it.
I had a look on npm at pixl-json-stream and json-stream, but in my opinion none of these fit the purpose. There is clarinet-object-stream, but it would require building the JSON object from the ground up based on the events.
I'm not in control of the JSON object stream. Most of the time one object is on one line, however 10-20% of the time a JSON object spans multiple lines (\n as EOL) without a separator between objects. Each new object always starts on a new line.
Sample stream:
{ "a": "a", "b":"b" }
{ "a": "x",
"b": "y", "c": "z"
}
{ "a": "a", "b":"b" }
There must be a solution already; I'm just missing something obvious. I would rather find an appropriate module than hack the stream parser with regexps to handle this scenario.
I'd recommend trying to parse every line:
const readline = require('readline');
const rl = readline.createInterface({
  input: child.stdout
});

var tmp = ''
rl.on('line', function(line) {
  tmp += line
  try {
    var obj = JSON.parse(tmp)
    child.emit('json', obj)
    tmp = ''
  } catch(_) {
    // JSON.parse may fail if JSON is not complete yet
  }
})

child.on('json', function(obj) {
  console.log(obj)
})
As the child is an EventEmitter, one can just call child.emit('json', obj).
Having the same requirement, I was uncomfortable enforcing a requirement for newlines to support readline, needed to be able to handle starting the read in the middle of a stream (possibly the middle of a JSON document), and didn't like constantly parsing and checking for errors (seemed inefficient).
As such, I preferred using the clarinet SAX parser, collecting the documents as I went and emitting doc events once whole JSON documents had been parsed.
I just published this class to NPM
https://www.npmjs.com/package/json-doc-stream
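For illustration, here is a rough sketch of that collecting approach (not the published package itself; the class name JsonDocEmitter and the wiring are made up here, and it assumes clarinet's streaming events 'openobject', 'key', 'value', 'closeobject', 'openarray' and 'closearray'):
const { EventEmitter } = require('events')
const clarinet = require('clarinet')

// Emits a 'json' event for every complete top-level document found in the stream.
class JsonDocEmitter extends EventEmitter {
  constructor(source) {
    super()
    this.stack = []      // parent containers of the value currently being built
    this.current = null  // object/array currently being filled
    this.key = null      // pending key inside the current object
    const parser = clarinet.createStream()
    parser.on('openobject', firstKey => {
      this._open({})
      if (firstKey !== undefined) this.key = firstKey
    })
    parser.on('key', k => { this.key = k })
    parser.on('openarray', () => this._open([]))
    parser.on('value', v => this._attach(v))
    parser.on('closeobject', () => this._close())
    parser.on('closearray', () => this._close())
    parser.on('error', err => this.emit('error', err))
    source.pipe(parser)
  }
  _open(container) {
    if (this.current !== null) {
      this._attach(container)        // nest the new container inside its parent
      this.stack.push(this.current)
    }
    this.current = container
  }
  _attach(value) {
    if (this.current === null) {
      this.emit('json', value)       // bare top-level scalar
    } else if (Array.isArray(this.current)) {
      this.current.push(value)
    } else {
      this.current[this.key] = value
    }
  }
  _close() {
    const finished = this.current
    this.current = this.stack.length ? this.stack.pop() : null
    if (this.current === null) this.emit('json', finished)  // a whole document is done
  }
}
Usage would then be new JsonDocEmitter(child.stdout) with a listener on its 'json' event.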

How can I create a new Buffer class?

Okay, not completely sure how to phrase this, but I'm gonna try my best.
I have been trying to use node's new vm module, and I wanted to enable Buffer support for the code running within the vm.
Here is the initial code I was using:
let vm = require('vm');
let someCode = '... some unsafe code ...';
let result = vm.runInNewContext(
  someCode,
  { Buffer: Buffer }
);
However, as I quickly discovered, this will return the node process's Buffer class, so if someone were to modify Buffer.prototype within the vm, this will also change the Buffer outside the vm.
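The leak is easy to demonstrate (the property name below is made up for illustration):
vm.runInNewContext('Buffer.prototype.leaked = true', { Buffer: Buffer });
console.log(Buffer.prototype.leaked); // true -- the change is visible outside the vm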
Therefore, after looking at the docs I tried changing the sandbox object to:
{ Buffer: require('buffer').Buffer }
After some checking, I discovered to my dismay that this doesn't work as expected either, since require somehow returns the same object:
let b = require('buffer').Buffer;
console.log(b === Buffer); // This becomes true
Is it possible to create a new Buffer class using the vm, or will I have to further safeguard by using child_process in some fancy way?

Node.js: Capture STDOUT of `child_process.spawn`

I need to capture the output of a spawned child process in a custom stream.
child_process.spawn(command[, args][, options])
For example,
var s = fs.createWriteStream('/tmp/test.txt');
child_process.spawn('ifconfig', [], {stdio: [null, s, null]})
Now how do I read from the /tmp/test.txt in real time?
It looks like child_process.spawn is not using stream.Writable.prototype.write nor stream.Writable.prototype._write for its execution.
For example,
s.write = function() { console.log("this will never get printed"); };
As well as,
s.__proto__._write = function() { console.log("this will never get printed"); };
It looks like it uses file descriptors under-the-hood to write from child_process.spawn to a file.
Doing this does not work:
var s2 = fs.createReadStream('/tmp/test.txt');
s2.on("data", function() { console.log("this will never get printed either"); });
So, how can I get the STDOUT contents of a child process?
What I want to achieve is to stream STDOUT of a child process to a socket. If I provide the socket directly to the child_process.spawn as a stdio parameter it closes the socket when it finishes, but I want to keep it open.
Update:
The solution is to use default {stdio: ['pipe', 'pipe', 'pipe']} options and listen to the created .stdout of the child process.
var cmd = child_process.spawn('ifconfig');
cmd.stdout.on("data", (data) => { ... });
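That also covers the original goal of streaming to a socket without closing it: pipe the child's stdout into the socket with the automatic end disabled (a minimal sketch; socket stands for your already-open socket):
cmd.stdout.pipe(socket, { end: false });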
Now, to up the ante, a more challenging question:
-- How do you read the STDOUT of the child process and still preserve the colors?
For example, if you send STDOUT to process.stdout like so:
child_process.spawn('ifconfig', [], {stdio: [null, process.stdout, null]});
it will keep the colors and print colored output to the console, because the .isTTY property is set to true on process.stdout.
process.stdout.isTTY // true
Now if you use the default {stdio: ['pipe', 'pipe', 'pipe']}, the data you will read will be stripped of console colors. How do you get the colors?
One way to do that would be creating your own custom stream with fs.createWriteStream, because child_process.spawn requires your streams to have a file descriptor.
Then setting .isTTY of that stream to true, to preserve colors.
And finally you would need to capture the data that child_process.spawn writes to that stream, but since child_process.spawn does not use .prototype.write nor .prototype._write of the stream, you would need to capture its contents in some other hacky way.
That's probably why child_process.spawn requires your stream to have a file descriptor because it bypasses the .prototype.write call and writes directly to the file under-the-hood.
Any ideas how to implement this?
You can do it without using a temporary file:
var child = child_process.spawn(command, args, options);
child.stdout.on('data', function (chunk) {
  console.log(chunk);
});
Hi, I'm on my phone, but I will try to guide you as best I can. I will clarify when near a computer if needed.
What I think you want is to read the stdout from a spawn and do something with the data?
You can give the spawn a variable name instead of just running the function, e.g.:
var child = spawn();
Then listen to the output like:
child.stdout.on('data', function(data) {
  console.log(data.toString());
});
You could use that to write the data then to a file or whatever you may want to do with it.
The stdio option requires file descriptors, not stream objects, so one way to do it is to use fs.openSync() to create an output file descriptor and use that.
Taking your first example, but using fs.openSync():
var s = fs.openSync('/tmp/test.txt', 'w');
var p = child_process.spawn('ifconfig', [], {stdio: [process.stdin, s, process.stderr]});
You could also set both stdout and stderr to the same file descriptor (for the same effect as bash's 2>&1).
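For example (a variation of the sketch above, reusing the same descriptor for both):
var p = child_process.spawn('ifconfig', [], {stdio: [process.stdin, s, s]});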
You'll need to close the file when you are done, so:
p.on('close', function(code) {
  fs.closeSync(s);
  // do something useful with the exit code ...
});
