Node.js: Error handling when using pipe with streams

I am reading a large CSV and transforming it to create multiple other CSVs. Here is the relevant part of my code. How do I handle errors at each step in this case?
I have added a generic on('error') handler, but it does not always seem to get triggered.
Does each link of the chain in the code below need its own on('error'), or is there a more elegant way to do this?
await fs.createReadStream(m.path)
    .pipe(csv.parse({delimiter: '\t', columns: true}))
    .pipe(csv.transform((input) => {
        //delete input['Date'];
        console.log(input);
        return input;
    }))
    .pipe(csv.stringify({header: true}))
    .pipe(fs.createWriteStream(transformedPath))
    .on('finish', () => {
        console.log('finish....');
    }).on('error', () => {
        console.log('error.....');
    });
Thanks.

The stream API requires each "middleware" to have its own .on("error") handler.
However, NodeJS ships with a utility function for exactly this use case: stream.pipeline. The pipeline handles errors and closes all the streams correctly.
Using a pipeline, the example code provided by the OP would be written as:
import stream from "stream";
import util from "util";

try {
    await util.promisify(stream.pipeline)(
        fs.createReadStream(m.path),
        csv.parse({delimiter: '\t', columns: true}),
        csv.transform((input) => {
            //delete input['Date'];
            console.log(input);
            return input;
        }),
        csv.stringify({header: true}),
        fs.createWriteStream(transformedPath)
    );
    console.log('finish....');
} catch (error) {
    console.log('error.....');
}
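Note that on Node.js 15+ the stream/promises module exports an already-promisified pipeline, so the util.promisify wrapper can be dropped. A minimal sketch of the same flow, assuming the same csv module and paths as above:

import fs from "fs";
import { pipeline } from "stream/promises";

try {
    // pipeline propagates errors from, and cleans up, every stage
    await pipeline(
        fs.createReadStream(m.path),
        csv.parse({delimiter: '\t', columns: true}),
        csv.transform((input) => input), // pass-through transform, as in the OP
        csv.stringify({header: true}),
        fs.createWriteStream(transformedPath)
    );
    console.log('finish....');
} catch (error) {
    console.log('error.....', error);
}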


proper import/export and binding for ESM/Node

I have some pre-ES6 code that was working fine with require-style syntax, and I am trying to port it to an ESM-friendly methodology. I have much of the .js-to-.mjs porting done, and am working on a dynamic file-loading block that is giving me fits. I've gotten through enough googling and debugging to realize my issue is now with the syntax of the export function in the "events" file, which I'm using default syntax to support.
First the calling file:
async function eventLoad(eventDir = './events/')
{
    await fs.readdir(`./events/`, function (err, files) {
        if (err)
        {
            console.error(`${error}: Error loading event: ${err}`);
            return;
        }
        else
        {
            const events = files.filter(file => file.endsWith('.mjs'));
            for (const file of events)
            {
                console.log(`event file name is ${file}`);
                const {default: event} = import(join(`./events/`, `${file}`));
                const eventName = file.split('.')[0];
                dBot.on(eventName, event.bind(null, dBot));
                console.log(`${success} Loaded event ${eventName}`);
            }
        }
    })
}
eventLoad();
// please ignore the missing promise on the import...I'll be adding it shortly. :)
I get an error:
TypeError: Cannot read property 'bind' of undefined
My export in the other file is declared like this:
export default async (client, message) =>
{
...
Knowing from googled issues that the problem most likely rests in an improper definition in the file that exports the function, I tried playing with the syntax:
async function ProcessMessages (client, message)
{
...
}
export {ProcessMessages as default };
but alas, no help. I'm sure the issue is in properly handling the export syntax, but I'm learning this as I go and would appreciate any help you could provide, thanks!!!
Update: (based on the solution, here are my code changes in case anyone else wants to leverage them)
async function eventLoad(eventDir = './events/')
{
    let files;
    try {
        files = await promiseBasedReaddir(`${eventDir}`);
    }
    catch (err) {
        console.error(`${error}: Error loading event: ${err}`);
        return;
    }
    const events = files.filter(file => file.endsWith('.mjs'));
    for (const file of events)
    {
        console.log(`event file name is ${file}`);
        import(`${eventDir}${file}`)
            .then(function ({default: event}) {
                const eventName = file.split('.')[0];
                dBot.on(eventName, event.bind(null, dBot));
                console.log(`${success} Loaded event ${eventName}`);
            })
            .catch(function (err) {
                console.error(`${error}: Error loading event: ${err}`);
                return;
            })
    }
}
eventLoad();
eventLoad();
On this line, you do a dynamic import:
const {default: event} = import(join(`./events/`,`${file}`));
import will always return a Promise. The Promise object doesn't have a .bind method, only functions do. You need to await or .then() import()'s return value. So you can't "ignore the missing promise on the import".
A second issue:
await fs.readdir(`./events/`, ...)
fs.readdir doesn't return a promise, so await-ing it is useless. It is functionally the same as await undefined;. The outer function will return before the callback has finished.
If you use a readdir function that returns a promise, you can simplify things a lot.
import { promises as fsPromises } from 'fs';
const { readdir: promiseBasedReaddir } = fsPromises;
// OR alternatively:
import { readdir } from 'fs';
import { promisify } from 'util';
const promiseBasedReaddir = promisify(readdir);
async function eventLoad(eventDir = './events/')
{
    let files;
    try {
        files = await promiseBasedReaddir(`./events/`);
    } catch (err) {
        console.error(`${error}: Error loading event: ${err}`);
        return;
    }
    const events = files.filter(file => file.endsWith('.mjs'));
    for (const file of events)
    {
        console.log(`event file name is ${file}`);
        const {default: event} = await import(join(`./events/`, `${file}`));
        const eventName = file.split('.')[0];
        dBot.on(eventName, event.bind(null, dBot));
        console.log(`${success} Loaded event ${eventName}`);
    }
}
promisify transforms a callback-style function into a promise-returning function. The fs module already ships promise-based functions, so you don't need promisify if you just import those.
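On Node 14+, the promise-based fs API can also be imported directly from the fs/promises subpath, which skips the destructuring step. A small variant sketch, assuming the same directory layout (inside an async function or an ES module with top-level await):

import { readdir } from "fs/promises";

// readdir from fs/promises returns a promise directly
const files = await readdir("./events/");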

Passing parameters to Express middleware not working

I'm trying to create an input validation middleware using Express. My goal is to be able to pass 2 parameters to the middleware that validates client input. The problem is that, even after following multiple resources (including the Express docs), my middleware doesn't seem to be working.
// input validator middleware
export const validateInput = (schema: joi.ObjectSchema) => {
    console.log('first console log');
    return (req: Request, res: Response, next: NextFunction) => {
        console.log('second console log');
        const { error } = schema.validate(req.body);
        if (error) {
            const errors = error.details.map((err) => err.message);
            next(new InvalidInput(errors));
        }
        next();
    };
};

// middleware call
const commentSchema = joi
    .object({
        content: joi.string().alphanum().min(3).required(),
    })
    .options({ abortEarly: false });

export const validateCommentInput = () => {
    validateInput(commentSchema);
};
After calling the middleware, I get to the "first console log", but never to the second, and my API just hangs there until I force stop it. My fallback solution would be to just pass req and next as parameters to a function, validateInput(req, next, commentSchema);, but I'm not sure that's the proper way to do it. I also tried the async version with the same results.
Any help is greatly appreciated.
Your validateCommentInput function isn't returning the inner function.
An arrow function without curly braces implicitly returns its expression; once you add curly braces, you have to write return explicitly.
So change this:
export const validateCommentInput = () => {
validateInput(commentSchema);
};
to this:
export const validateCommentInput = () => validateInput(commentSchema);
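For completeness, here is a sketch of how the fixed factory could be mounted on a route. The route path and handler are hypothetical, not from the OP's project; the point is that validateCommentInput() is invoked, so Express receives the returned (req, res, next) middleware rather than the factory itself:

import express from 'express';

const router = express.Router();

// validateCommentInput() evaluates to the middleware returned by validateInput(commentSchema)
router.post('/comments', validateCommentInput(), (req, res) => {
    res.status(201).json({ content: req.body.content });
});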

NodeJS: How to read from two files and write to single output file using pipes?

Context
I am using the event-stream module to help me read from and write to these local files, from which I hope to return a resulting file. Long story short, the 2 input files (sent through an Express API as multipart/form-data) can each be upwards of 200MB in size and contain a list of entries (1 per line). What I would like to do is combine those entries in the following format <entry1>:<entry2>, where entry1 is the entry from the first file and entry2 is from the second file.

I did this earlier in a way that stored and returned the inputs/outputs in memory, but seeing as I have very limited memory on my application server, I was running out of heap space. I read that I could use event-stream and piping to read each file in line by line and output to a file, instead of to a large string in memory, using read streams. The issue is that I can't seem to resolve in the right way/at the right time for the resulting output file to be ready to send back to the caller.
What I have so far
What I have so far works, in that I get the correct file output I am expecting; however, this seems to be an asynchronicity problem, in that I am resolving the promise before the file has actually finished writing/saving. Please see my code below...
const fs = require('fs');
const es = require('event-stream');
const uuid = require('uuid');

const buildFile = async (fileOne, fileTwo) =>
    await new Promise((resolve, reject) => {
        try {
            // Output stream
            let fileID = uuid.v4();
            let outStream = fs.createWriteStream(`files/outputFile-${fileID}.txt`, {
                flags: 'a',
                encoding: 'utf-8'
            });
            let fileOneRS = fs
                .createReadStream(fileOne.path, {
                    flags: 'r',
                    encoding: 'utf-8'
                })
                .pipe(es.split())
                .pipe(
                    es.mapSync((lineOne) => {
                        fileOneRS.pause();
                        let fileTwoRS = fs
                            .createReadStream(fileTwo.path, {
                                flags: 'r',
                                encoding: 'utf-8'
                            })
                            .pipe(es.split())
                            .pipe(
                                es.mapSync((lineTwo) => {
                                    fileTwoRS.pause();
                                    // Write combo to file
                                    outStream.write(`${lineOne}:${lineTwo}\n`);
                                    fileTwoRS.resume();
                                })
                            );
                        fileOneRS.resume();
                    })
                ); // This is where I have tried doing .on('end', () => resolve), but it also does not work :(
        } catch (err) {
            reject(err);
        }
    });
Note: This function is called from another service function as follows:
buildFile(fileOne, fileTwo)
    .then((result) => {
        resolve(result);
    })
    .catch((err) => {
        console.log(err);
        reject(err);
    });
As a novice JavaScript developer, and even newer to NodeJS, I have been stuck trying to figure this out on my own for over 2 weeks now. If anyone is able to help, I would greatly appreciate some wisdom here!
Thanks 🙂
Edit: Updated the code to conform to the OP's expected output.
The promise's resolve() function should be called once the write stream completes. The comment provided in the OP snippet indicates that the resolve function might have been called upon draining fileOneRS (at the end of the pipe() chain).
Rather than creating a new read stream for each line in the first file, the code should only instantiate the read streams once.
The following example illustrates how this code flow could be refactored to read each line only once and concatenate the lines from files A and B line by line:
import stream from "stream";
import util from "util";
import readline from "readline";
import fs from "fs";
import os from "os";

/** Returns a readable stream as an async iterable over text lines */
function lineIteratorFromFile(fileStream) {
    return readline.createInterface({
        input: fileStream,
        crlfDelay: Infinity
    });
}

// Use stream.pipeline to handle errors and to stream the combined output
// to a Writable stream. The promise will resolve once the data has finished
// writing to the output stream.
await util
    .promisify(stream.pipeline)(
        async function* () {
            for await (const lineA of lineIteratorFromFile(fs.createReadStream("./in1.txt"))) {
                for await (const lineB of lineIteratorFromFile(fs.createReadStream("./in2.txt"))) {
                    yield `${lineA}: ${lineB}${os.EOL}`;
                }
            }
        },
        fs.createWriteStream(outputFile)
    );
A runnable example for NodeJS v13+ (where stream.pipeline can consume an async generator as a source) is shown below:
// in1.txt:
foo1
foo2

// in2.txt:
bar1
bar2

// out.txt (the file created by this script, with the expected output):
foo1: bar1
foo1: bar2
foo2: bar1
foo2: bar2

// main.mjs:
import stream from "stream";
import util from "util";
import readline from "readline";
import fs from "fs";
import os from "os";

/** Returns a readable stream as an async iterable over text lines */
function lineIteratorFromFile(fileStream) {
    return readline.createInterface({
        input: fileStream,
        crlfDelay: Infinity
    });
}

(async () => {
    await util
        .promisify(stream.pipeline)(
            async function* () {
                for await (const lineA of lineIteratorFromFile(fs.createReadStream("./in1.txt"))) {
                    for await (const lineB of lineIteratorFromFile(fs.createReadStream("./in2.txt"))) {
                        yield `${lineA}: ${lineB}${os.EOL}`;
                    }
                }
            },
            fs.createWriteStream("./out.txt")
        );
})()
    .catch(console.error);

jasmine spy not finding property

I have a file that basically looks like this (shortened):
const octokit = new (require("@octokit/rest"))();

function buildRepo(name) {
    fs.promises
        .readFile("data/settings.json")
        .then(data => JSON.parse(data))
        .then(settings => settings.repositories.find(repo => repo.name === name))
        .then(repo => {
            let repoName = repo.url
                .substring(repo.url.lastIndexOf("/") + 1)
                .slice(0, -4);
            let jobName = repo.name;
            return octokit.repos
                .get({
                    owner: "munhunger",
                    repo: repoName
                })
                .then(({ data }) => {
                    ...
                });
        });
}
module.exports = { buildRepo };
And so I want to write a test for what it does with the data that it gets from the octokit.repos.get function. But since that function will go out to the internet and look at GitHub repositories, I want to mock it.
I have a few tests running with jasmine, and I read up slightly on it and it seems as if jasmine should be able to mock this for me.
However, the test that I have written seems to fail.
const builder = require("./index");

describe("spyOn", () => {
    it("spies", () => {
        spyOnProperty(builder, "octokit");
        builder.buildRepo("blitzbauen");
    });
});
With the error octokit property does not exist. What am I doing wrong here? Would I need to add octokit to module.exports? (which seems rather insane)
Yes, you'd need to add octokit to module.exports, since you currently only export buildRepo.
Anything from a module that is not exported can't be accessed directly by other modules, so if it should be accessible it should be exported.
Alternatively, you may be able to mock the entire Octokit module with Jasmine so that any calls by any script hit the mocked version, but I'm not sure how you'd go about doing that, since my experience with Jasmine is limited.
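To illustrate the first suggestion, a minimal sketch; the export change and the canned return value below are assumptions for the example, not code from the OP's project. Since repos.get is an ordinary method on the exported object, a plain spyOn (rather than spyOnProperty) fits here:

// index.js: also export the octokit instance so tests can reach it
module.exports = { buildRepo, octokit };

// index.spec.js
const builder = require("./index");

describe("buildRepo", () => {
    it("uses a mocked octokit.repos.get", () => {
        // replace the network call with a canned resolved value
        spyOn(builder.octokit.repos, "get")
            .and.returnValue(Promise.resolve({ data: { full_name: "munhunger/fake" } }));
        builder.buildRepo("blitzbauen");
        // note: buildRepo doesn't return its promise chain, so asserting on the
        // spy's result here would first require making buildRepo return that chain
    });
});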

Promise.all error handling โ€” make result of one Promise accessible in another Promise's catch?

I'm writing a simple build script which compiles some files. The only problem remaining is the error handling. It works, but I want to add additional content to the error messages. Here is a snippet of the code in question:
const promises = []
for (let file of files) {
    promises.push(Promise.all([exec(compile(file)), Promise.resolve(file)]))
}

Promise.all(promises.map(p => p.catch(e => console.log(e))))
    .then(result => {
        /* result is now an array with the following pattern:
           [[stdout, filename], [stdout, filename], ...]
        */
    });
The exec function returns some stdout containing data which does not state which file was used. Therefore I have added a Promise.all containing both the exec call and a promise which immediately resolves with the filename. I need the data returned from exec, plus the filename, for when I write the files to the system. Because I still want the last then to run regardless of any errors, I handle the errors for each file individually (hence the .map). The only issue is that the stdout from exec does not reference the file it used, so the error messages get confusing. I would like something like the following:
p.catch(e => console.log(`error happened in ${file}:`, e))
I'm not sure how I can access the file variable from within the catch. Any ideas?
You should directly add the catch when calling the function:
for (let file of files) {
    promises.push(
        exec(compile(file))
            .then(result => [result, file])
            .catch(error => [error, file])
    );
}

Promise.all(promises).then(results => {
    //...
});
You might want to put the catch in the loop where the respective file is still in scope:
Promise.all(files.map(file =>
    Promise.all([
        exec(compile(file)).catch(e => console.log(`error happened in ${file}:`, e)),
        file
    ])
)).then(result => {
    /* result is now an array with the following pattern:
       [[stdout/undefined, filename], [stdout/undefined, filename], ...]
    */
});
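As a side note, Node 12.9+ ships Promise.allSettled, which keeps per-promise outcomes without hand-rolled catch wrappers. A sketch under the same exec/compile/files assumptions as above (run inside an async function):

const results = await Promise.allSettled(
    files.map(file => exec(compile(file)).then(stdout => [stdout, file]))
);

results.forEach((r, i) => {
    if (r.status === 'fulfilled') {
        const [stdout, file] = r.value;
        // ... write stdout for this file
    } else {
        // allSettled preserves input order, so files[i] is the failing file
        console.log(`error happened in ${files[i]}:`, r.reason);
    }
});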
