Get data from Stream.Writable into a string variable - javascript

I am using the @kubernetes/client-node library.
My end goal is to execute commands (say "ls") and get the output for further processing.
The .exec() method requires two Writable streams (for the WebSocket to write the output to) and one Readable stream (to push our commands into).
The code I have looks something like this:
const outputStream = new Stream.Writable();
const commandStream = new Stream.Readable();
const podExec = await exec.exec(
  "myNamespace",
  "myPod",
  "myContainer",
  ["/bin/sh", "-c"],
  outputStream,
  outputStream,
  commandStream,
  true
);
commandStream.push("ls -l\n");
// get the data from Writable stream here
outputStream.destroy();
commandStream.destroy();
podExec.close();
I am pretty new to JS and am having trouble getting the output from the Writable stream since it doesn't allow direct reading.
Creating a Writable stream to a file and then reading from it seems unnecessarily overcomplicated.
I would like to write the output as a string to a variable.
Has anyone encountered the same task before, and if so, what can you suggest to get the command output?
I would appreciate any help on this matter!
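For reference, one common pattern (a minimal sketch, not specific to @kubernetes/client-node) is to give the Writable a write implementation that appends each chunk to a string, and to create the Readable with a no-op read() so it accepts push():

const { Writable, Readable } = require("stream");

let output = "";
// Collect everything written to this stream into the output string
const outputStream = new Writable({
  write(chunk, encoding, callback) {
    output += chunk.toString();
    callback();
  },
});

// Readable with a no-op read(); commands are fed in via push()
const commandStream = new Readable({ read() {} });

// ...pass outputStream / commandStream to exec.exec() as above, push the
// command, then read the output variable once the exec connection has closed.
commandStream.push("ls -l\n");
commandStream.push(null); // signal end of input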

Related

AWS IoT-Core how to properly publish an actual new line in a JSON value, not \r\n or \n

I have a Lambda (Node.js) that reads a file (.ref) in an S3 bucket and publishes its content to a topic on the AWS IoT-Core broker.
The file contains something like this (50 lines):
model:type
aa;tp1
bb;tpz
cc;tpf
dd;tp1
The code must remove the first line (the header) and return the remaining 50 lines. This is the code:
async function gerRef (BUCKET_NAME) {
  const refFile = version + '/file.ref'
  const ref = await getObject(FIRMWARE_BUCKET_NAME, refFile)
  // Get the file content
  const refString = ref.toString('utf8')
  // Split by each line
  const arrRef = refString.split('\n')
  // Remove the file header
  arrRef.shift()
  // Join the lines back together and return them
  return arrRef.join('\n')
}
Then I take this result and publish it to the AWS IoT-Core broker like this:
const publishMqtt = (params) =>
  new Promise((resolve, reject) =>
    iotdata.publish(params, (err, res) => err ? reject(err) : resolve(res)))
...
let refData = await gerRef(bucket1)
let JsonPayload = {
  "attr1": "val1",
  "machineConfig": `${refData}` // Maybe here is the issue
}
let params = {
  topic: 'test/1',
  payload: JSON.stringify(JsonPayload), // Maybe here is the issue
  qos: 0
};
await publishMqtt(params)
...
Then it publishes in the broker.
The issue is that the content is being published without a real new line. When I look in the broker I get the following JSON:
{
  "attr1": "val1",
  "machineConfig": "aa;tp1\nbb;tpz\ncc;tpf\ndd;tp1"
}
The machine that receives this message is expecting a real new line, something like this:
{
  "attr1": "val1",
  "machineConfig": "aa;tp1
bb;tpz
cc;tpf
dd;tp1"
}
If I just copy and paste this entire JSON into the AWS IoT-Core interface, it will complain about JSON parsing but will publish it as a string, and the machine will accept the data because the new line is there.
In short, the main points here are:
We can use JSON.stringify(JsonPayload) - the broker will accept it
I don't know how to stringify and keep the actual new line
I have tried these solutions but none of them worked: s1, s2, s3
Any guess in how to achieve this?
What that machine is expecting is wrong. In JSON, any newline inside a value must be escaped, and \n in the string is the correct way to do it. This is the fault of the receiver's expectations.
A "real" newline would result in an invalid JSON document and most parsers will flat-out reject it.
On the receiving end, the JSON deserializer can deal with \n-encoded strings. If your receiver requires literal newlines, it's broken and needs repairing. If you can't repair it, then you're committed to sending busted-up, malformed JSON-ish data that isn't actually JSON, and your broker is fully justified in trashing it.
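As a small illustration of that point, using nothing more than plain JSON.stringify / JSON.parse:

const payload = JSON.stringify({ machineConfig: "aa;tp1\nbb;tpz" });
console.log(payload);
// {"machineConfig":"aa;tp1\nbb;tpz"}   <- newline escaped on the wire, valid JSON

const restored = JSON.parse(payload);
console.log(restored.machineConfig);
// aa;tp1
// bb;tpz   <- the real newline is back after parsing on the receiving end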

How can I create an array as a database with JSON files and use JavaScript to update / save it

I am making a Discord bot in Node.js, mostly for fun and to get better at coding, and I want the bot to push a string into an array and update the array file permanently.
I have been using separate .js files for my arrays, such as this:
module.exports = [
  "Map: Battlefield",
  "Map: Final Destination",
  "Map: Pokemon Stadium II",
];
and then requiring them in my main file. Now I tried using .push() and it adds the desired string, but only for that one session.
What is the best solution to have an array I can update and save the inputs to? Apparently JSON files are good for this.
Thanks, Carl
Congratulations on the idea of writing a bot in order to get some coding practice. I bet you will succeed with it!
I suggest you try to split your problem into small chunks, so it will be easier to reason about.
Step 1 - storing
I agree with you on using JSON files as data storage. For an app that is intended to be a "training gym" this is more than enough, and you have all the time in the world to start looking into databases like Postgres, MySQL or Mongo later on.
A JSON file to store a list of values may look like this:
{
  "values": [
    "Map: Battlefield",
    "Map: Final Destination",
    "Map: Pokemon Stadium II"
  ]
}
When you save this piece of JSON into list1.json, you have your first data file.
Step 2 - reading
Reading a JSON file in NodeJS is easy:
const list1 = require('./path-to/list1.json');
console.log(list1.values);
This will load the entire content of the file in memory when your app starts. You can also look into more sophisticated ways to read files using the file system API.
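For example, a minimal sketch using the fs module instead of require (same hypothetical path as above), so the file is re-read on demand instead of being cached by the module loader:

const fs = require('fs');

// Read and parse the JSON file each time this code runs
const list1 = JSON.parse(fs.readFileSync('./path-to/list1.json', 'utf8'));
console.log(list1.values);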
Step 3 - writing
Looks like you know your way around in-memory array modifications using APIs like push() or maybe splice().
Once you have updated the in-memory representation, you need to persist the change to your file. You basically have to write it back down in JSON format.
Option n.1: you can use Node's file system API:
// https://stackoverflow.com/questions/2496710/writing-files-in-node-js
const fs = require('fs');

const filePath = './path-to/list1.json';
const fileContent = JSON.stringify(list1);

fs.writeFile(filePath, fileContent, function(err) {
  if (err) {
    return console.log(err);
  }
  console.log("The file was saved!");
});
Option n.2: you can use fs-extra which is an extension over the basic API:
const fs = require('fs-extra');

const filePath = './path-to/list1.json';

fs.writeJson(filePath, list1, function(err) {
  if (err) {
    return console.log(err);
  }
  console.log("The file was saved!");
});
In both cases, list1 comes from the previous steps; it is the object whose array you modified in memory.
Be careful with asynchronous code:
Both writing examples use non-blocking asynchronous API calls - the link points to a decent article.
For simplicity's sake, you can start by using the synchronous APIs, which are basically:
fs.writeFileSync
fs.writeJsonSync
You can find all the details in the links above.
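For instance, a synchronous version of option n.1 could be as simple as this (same list1 and filePath as above):

const fs = require('fs');

const filePath = './path-to/list1.json';
// Blocks until the file is written - fine for a small bot, not for a busy server
fs.writeFileSync(filePath, JSON.stringify(list1, null, 2));
console.log("The file was saved!");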
Have fun with bot coding!

Can I [de]serialize a dictionary of dataframes in the arrow/js implementation?

I want to use Apache Arrow to send data from a Django backend to an Angular frontend. I want to use a dictionary of dataframes/tables as the payload in messages. It's possible with pyarrow to share data this way between Python microservices, but I can't find a way with the JavaScript implementation of Arrow.
Is there a way to serialize/deserialize a dictionary with strings as keys and dataframes/tables as values on the JavaScript side with Arrow?
Yes, a variant of this is possible using the RecordBatchReader and RecordBatchWriter IPC primitives in both pyarrow and ArrowJS.
On the python side, you can serialize a Table to a buffer like this:
import pyarrow as pa

def serialize_table(table):
    sink = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(sink, table.schema)
    writer.write_table(table)
    writer.close()
    return sink.getvalue().to_pybytes()

# ...later, in your route handler:
bytes = serialize_table(create_your_arrow_table())
Then you can send the bytes in the response body. If you have multiple tables, you can concatenate the buffers from each as one large payload.
I'm not sure what functionality exists to write multipart/form-body responses in Python, but that's probably the best way to craft the response if you want the tables to be sent with their names (or any other metadata you wish to include).
On the JavaScript side, you can read the response either with Table.from() (if you have just one table), or with the RecordBatchReader if you have more than one, or if you want to read each RecordBatch in a streaming fashion:
import { Table, RecordBatchReader } from 'apache-arrow'

// easy if you want to read the first (or only) table in the response
const table = await Table.from(fetch('/table'))

// or for multiple tables on the same stream, or to read in a streaming fashion:
for await (const reader of RecordBatchReader.readAll(fetch('/table'))) {
  // Buffer all batches into a table
  const table = await Table.from(reader)
  // Or process each batch as it's downloaded
  for await (const batch of reader) {
  }
}
You can see more examples of this in our tests for ArrowJS here:
https://github.com/apache/arrow/blob/3eb07b7ed173e2ecf41d689b0780dd103df63a00/js/test/unit/ipc/writer/stream-writer-tests.ts#L40
You can also see some examples in a little fastify plugin I wrote for consuming and producing Arrow payloads in node: https://github.com/trxcllnt/fastify-arrow
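Putting the question's "dictionary of tables" back together on the JavaScript side could then look roughly like this - a sketch that assumes the table names arrive out of band (for example in a response header or a small JSON field), which the answer above leaves open:

import { Table, RecordBatchReader } from 'apache-arrow'

// Assumption: the backend sends the table names separately (header, JSON field, ...)
const names = ['table_a', 'table_b']
const tables = {}

let i = 0
// '/tables' is a hypothetical endpoint returning the concatenated table buffers
for await (const reader of RecordBatchReader.readAll(fetch('/tables'))) {
  // Pair each table in the stream with its name, in order
  tables[names[i++]] = await Table.from(reader)
}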

nodejs express stream from array

I'm building an app in which I need to stream data to the client; my data is simply an array of objects.
This is the for loop which builds the array:
for (let i = 0; i < files.length; i++) {
  try {
    let file = files[i]
    var musicPath = `${baseDir}/${file}`
    let meta = await getMusicMeta(musicPath)
    musics.push(meta)
  } catch (err) {
    // handle the error
  }
}
Right now I wait for the loop to finish its work and then send the whole musics array to the client. I want to use a stream to send the musics entries one by one to the client instead of waiting for the loop to finish.
Use scramjet and send the stream straight to the response:
const { DataStream } = require("scramjet");
// ...
response.writeHead(200);
DataStream.fromArray(files)
  // all the magic happens below - flow control
  .map(file => getMusicMeta(`${baseDir}/${file}`))
  .toJSONArray()
  .pipe(response);
Scramjet will make use of your flow control and most importantly - it'll get the result out faster than any other streaming framework.
Edit: I wrote a couple of lines of code to make this use case easier in scramjet. :)
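For context, a minimal sketch of how this might sit inside an Express route handler (baseDir, files and getMusicMeta are assumed to exist as in the question):

const express = require("express");
const { DataStream } = require("scramjet");

const app = express();

app.get("/musics", (req, res) => {
  res.writeHead(200, { "Content-Type": "application/json" });
  DataStream.fromArray(files)
    // each entry is resolved and flushed to the response as soon as it's ready
    .map(file => getMusicMeta(`${baseDir}/${file}`))
    .toJSONArray()
    .pipe(res);
});

app.listen(3000);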

createReadStream in Node.JS

So I used fs.readFile() and it gives me
"FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of
memory"
Since fs.readFile() loads the whole file into memory before calling the callback, should I use fs.createReadStream() instead?
That's what I was doing previously with readFile:
fs.readFile('myfile.json', function (err1, data) {
  if (err1) {
    console.error(err1);
  } else {
    var myData = JSON.parse(data);
    // Do some operation on myData here
  }
});
Sorry, I'm kind of new to streaming; is the following the right way to do the same thing but with streaming?
var readStream = fs.createReadStream('myfile.json');
readStream.on('end', function () {
  readStream.close();
  var myData = JSON.parse(readStream);
  // Do some operation on myData here
});
Thanks
If the file is enormous then yes, streaming will be how you want to deal with it. However, what you're doing in your second example is letting the stream buffer all the file data into memory and then handling it on end. It's essentially no different than readFile that way.
You'll want to check out JSONStream. What streaming means is that you want to deal with the data as it flows by. In your case you obviously have to do this because you cannot buffer the entire file into memory all at once. With that in mind, hopefully code like this makes sense:
JSONStream.parse('rows.*.doc')
Notice that it has a kind of query pattern. That's because you will not have the entire JSON object/array from the file to work with all at once, so you have to think more in terms of how you want JSONStream to deal with the data as it finds it.
You can use JSONStream to essentially query for the JSON data that you are interested in. This way you're never buffering the whole file into memory. It does have the downside that if you do need all the data, then you'll have to stream the file multiple times, using JSONStream to pull out only the data you need right at that moment, but in your case you don't have much choice.
You could also use JSONStream to parse out data in order and do something like dump it into a database.
JSONStream.parse is similar to JSON.parse but instead of returning a whole object it returns a stream. When the parse stream gets enough data to form a whole object matching your query, it will emit a data event with the data being the document that matches your query. Once you've configured your data handler you can pipe your read stream into the parse stream and watch the magic happen.
Example:
var JSONStream = require('JSONStream');

var readStream = fs.createReadStream('myfile.json');
var parseStream = JSONStream.parse('rows.*.doc');

parseStream.on('data', function (doc) {
  db.insert(doc); // pseudo-code for inserting doc into a pretend database.
});

readStream.pipe(parseStream);
That's the verbose way to help you understand what's happening. Here is a more succinct way:
var JSONStream = require('JSONStream');

fs.createReadStream('myfile.json')
  .pipe(JSONStream.parse('rows.*.doc'))
  .on('data', function (doc) {
    db.insert(doc);
  });
Edit:
For further clarity about what's going on, try to think about it like this. Let's say you have a giant lake and you want to treat the water to purify it and move the water to a new reservoir. If you had a giant magical helicopter with a huge bucket then you could fly over the lake, put the lake in the bucket, add treatment chemicals to it, then fly it to its destination.
The problem of course being that there is no such helicopter that can deal with that much weight or volume. It's simply impossible, but that doesn't mean we can't accomplish our goal a different way. So instead you build a series of rivers (streams) between the lake and the new reservoir. You then set up cleansing stations in these rivers that purify any water that passes through them. These stations could operate in a variety of ways. Maybe the treatment can be done so fast that you can let the river flow freely and the purification will just happen as the water travels down the stream at maximum speed.
It's also possible that it takes some time for the water to be treated, or that the station needs a certain amount of water before it can effectively treat it. So you design your rivers to have gates and you control the flow of the water from the lake into your rivers, letting the stations buffer just the water they need until they've performed their job and released the purified water downstream and on to its final destination.
That's almost exactly what you want to do with your data. The parse stream is your cleansing station and it buffers data until it has enough to form a whole document that matches your query, then it pushes just that data downstream (and emits the data event).
Node streams are nice because most of the time you don't have to deal with opening and closing the gates. Node streams are smart enough to control backpressure when the stream buffers a certain amount of data. It's as if the cleansing station and the gates on the lake are talking to each other to work out the perfect flow rate.
If you had a streaming database driver then you'd theoretically be able to create some kind of insert stream and then do parseStream.pipe(insertStream) instead of handling the data event manually :D. Here's an example of creating a filtered version of your JSON file, in another file.
fs.createReadStream('myfile.json')
  .pipe(JSONStream.parse('rows.*.doc'))
  .pipe(JSONStream.stringify())
  .pipe(fs.createWriteStream('filtered-myfile.json'));
