Issues reading a small CSV file in Node - javascript

ISSUE: I am trying to use Node.js streams to read a small CSV file (1 row) using the fast-csv module.
The CSV rows are pushed to an array (rows[]) when the 'data' event is emitted. When 'end' is emitted, the data is updated in a DB. However, the 'end' event is sometimes triggered before the rows[] array has been populated. This happens intermittently, and sometimes the code works as intended.
My guess, after reading the Node.js docs, is that this is due to the small size of the CSV file. The data is being read in 'flowing' mode, and as soon as the first row is read the 'end' event is triggered, which seems to happen before the record is pushed to the required array.
I tried using 'paused' mode, but it didn't work.
I am new to Node.js and cannot figure out how to make this function work. Any help or guidance would be appreciated.
CODE:
function updateToDb(filename, tempLocation) {
  const rows = [];
  const readStream = fs.createReadStream(tempLocation + '\\' + filename).pipe(csv.parse());
  return new Promise((resolve, reject) => {
    readStream.on('data', row => {
      console.log('Reading');
      rows.push(row);
    })
    .on('end', () => {
      console.log('Completed');
      let query = `UPDATE ${tables.earnings} SET result_date = CASE `;
      rows.forEach(element => {
        query += `WHEN isin = '${element[0]}' AND announcement_date = '${element[1]}' THEN '${element[2]}' ELSE result_date`;
      });
      query += ' END';
      connection.query(query, (error, results) => {
        if (error)
          reject(error);
        else
          resolve(results.changedRows);
      });
    })
    .on('error', error => {
      reject(error);
    });
  });
}

Related

Watson Assistant context is not updated

I use Watson Assistant v1.
My problem is that every time I call my Node.js code, where I return the context so that the conversation stays coordinated, the context is only updated once and I get stuck in a node of the conversation.
This is my code:
client.on('message', message => {
  // general variables
  var carpetaIndividual = <../../../>
  var cuerpoMensaje = <....>
  var emisorMensaje = <....>
  // detect if context exists
  if (fs.existsSync(carpetaIndividual + '/contexto.json')) {
    var watsonContexto = require(carpetaIndividual + '/contexto.json');
    var variableContexto = watsonContexto;
  } else {
    var variableContexto = {}
  }
  // connection with Watson Assistant
  assistant.message(
    {
      input: { text: cuerpoMensaje },
      workspaceId: '<>',
      context: variableContexto,
    })
    .then(response => {
      let messageWatson = response.result.output.text[0];
      let contextoWatson = response.result.context;
      console.log('Chatbot: ' + messageWatson);
      // Save and create JSON file for context
      fs.writeFile(carpetaIndividual + '/contexto.json', JSON.stringify(contextoWatson), 'utf8', function (err) {
        if (err) {
          console.error(err);
        }
      });
      // Send messages to my application
      client.sendMessage(emisorMensaje, messageWatson)
    })
    .catch(err => {
      console.log(err);
    });
});
client.initialize();
The contexto.json file is updated, but when it is read, the code only reads the first update of the file and not the later updates.
This will be because you are using require to read the .json file. For all subsequent requires of an already-required file, the data is cached and reused.
You will need to use fs.readFile (or fs.readFileSync, as below) and JSON.parse:
// detect if context exists
if (fs.existsSync(carpetaIndividual + '/contexto.json')) {
  var watsonContexto = fs.readFileSync(carpetaIndividual + '/contexto.json');
  // Converting to JSON
  var variableContexto = JSON.parse(watsonContexto);
} else {
  var variableContexto = {}
}
There is another subtle problem with your code: you are relying on your async call to fs.writeFile completing before you read the file. This will be the case most of the time, but as you don't wait for fs.writeFile to complete, there is a chance that you may try to read the file before it is written.
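One way to close that gap (a sketch of the idea, not part of the original answer; it assumes Node 10+ for fs.promises) is to wait for the write to finish before the next message is handled, reusing the names from the question:
// Inside the .then(response => { ... }) block from the question:
// returning the promise makes the chain wait until contexto.json is on disk,
// so the next read is guaranteed to see the latest context.
return fs.promises.writeFile(
  carpetaIndividual + '/contexto.json',
  JSON.stringify(contextoWatson),
  'utf8'
).then(() => client.sendMessage(emisorMensaje, messageWatson));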

Why does Node pdfkit occasionally create a corrupted file in my code?

I have a function that creates a PDF file and sends it by email using pdfkit and nodemailer. Every now and then I get a file I can't open. I can't figure out why this happens and why it works most of the time. I haven't noticed any particular situation in which it fails; there doesn't seem to be any pattern to it (text length etc.). Could someone point out if there is some obvious problem in my PDF creation code (like with async/await)?
exports.sendTranscriptionToEmail = async (req, res) => {
  let finalText = [];
  let nickColorsArray = [];
  const doc = new PDFDocument();
  let filename = req.body.sessionName;
  let text = [];
  if (!filename || typeof filename != "string") {
    return res.status(400).send({ message: "Incorrect Information!" });
  }
  // here I get the data from the database
  try {
    const rows = await knex("transcriptions").select("*").where({
      conversation_id: filename,
    });
    if (!rows) {
      return res.status(400).send({ message: "Transcription not found" });
    }
    // Stripping special characters
    filename = encodeURIComponent(filename) + ".pdf";
    res.setHeader(
      "Content-disposition",
      'attachment; filename="' + filename + '"'
    );
    res.setHeader("Content-type", "application/pdf");
    doc.pipe(fs.createWriteStream(filename));
    doc.fontSize(18).fillColor("black").text("Participants:", {
      width: 410,
      align: "center",
    });
    doc.moveDown();
    nickColorsArray.forEach((n) => {
      doc.fontSize(14).fillColor(n.color).text(n.nick, {
        width: 410,
        align: "center",
      });
    });
    doc.moveDown();
    doc.moveDown();
    doc.fontSize(18).fillColor("black").text("Transcription:", {
      width: 410,
      align: "center",
    });
    doc.moveDown();
    finalText.forEach((f) => {
      doc
        .fontSize(14)
        .fillColor(f.color)
        .text(f.word + " ", {
          width: 410,
          continued: true,
        });
    });
    doc.end();
  } catch (err) {
    console.log("Something went wrong: ", err.message);
  }
};
I have experienced the same issue; after some investigation I found my solution at https://github.com/foliojs/pdfkit/issues/265:
PDFKit doesn't actually know when all of the data has been flushed to whatever stream you're writing to (file, http response, etc.). Since PDFKit has no access to the actual writable stream it is being piped to (PDFKit itself is a readable stream, and you set up the writable part), it only knows when it has finished pumping out chunks to whoever might be reading. It may be some time later that the writable stream actually flushes its internal buffers out to the actual destination.
I believe it's not quite as simple as listening for the 'finish' event on the write stream, specifically in the case of errors, so I implemented the following function which returns a Promise.
function savePdfToFile(pdf: PDFKit.PDFDocument, fileName: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    // To determine when the PDF has finished being written successfully
    // we need to confirm the following 2 conditions:
    //
    // 1. The write stream has been closed
    // 2. PDFDocument.end() was called synchronously without an error being thrown
    let pendingStepCount = 2;
    const stepFinished = () => {
      if (--pendingStepCount == 0) {
        resolve();
      }
    };
    const writeStream = fs.createWriteStream(fileName);
    writeStream.on('close', stepFinished);
    pdf.pipe(writeStream);
    pdf.end();
    stepFinished();
  });
}
Instead of calling .end() directly, you call this function and pass
the pdf document and filename.
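In the question's handler this would look roughly like the following (a hedged sketch: it replaces the doc.pipe(fs.createWriteStream(filename)) and doc.end() calls, and relies on the handler already being async, as it is in the question):
// After adding all content to `doc`, wait until the PDF is fully flushed to disk.
await savePdfToFile(doc, filename);
// Only now is it safe to attach `filename` to the nodemailer message and send the email.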
This should correctly handle the following situations:
- PDF generated successfully
- Error is thrown inside pdf.end() before the write stream is closed
- Error is thrown inside pdf.end() after the write stream has been closed
So in the end it seems that sometimes the server does not create the file fast enough for you. After implementing this solution my response time did increase by something like 1 second, but it got rid of the corruption behavior.

node.js csv how to handle bad data

I have an application parsing a CSV file. I use the csv module and it basically works fine. However, once there is a bad row in the CSV file, the whole process fails.
Is there any way to skip bad rows and resume streaming after catching an error?
This is a simple example:
var csv = require('csv');
var stream = require('stream');
var parser = csv.parse({ delimiter: "," });
parser.on("data", (chunk) => {
  console.log("one chunk");
  chunk.forEach((datum) => {
    console.log("data: ", datum);
  });
});
parser.on("error", (err) => {
  // Skip the error and resume stream here
  console.log("one error: ", err.message);
});
var test = "00,01,02,03\n10,11,12,23\n21,22,\n30,31,32,33";
var rs = new stream.Readable();
rs._read = () => {};
rs.push(test);
rs.pipe(parser);
Here the third row has only three columns while the other rows have four. I want to catch the error and write out all the other rows. Is there any good strategy to do this? Using some function or option in the csv module would be perfect.
Well, there are two things here.
The first one is that you can use relax_column_count: true in the csv.parse options, and it should work.
But if you test it you will see that the last line is missing. In fact, with the way you pass your stream, it would fail even with a proper CSV string, although it works if you pass a proper CSV file, so I suspect there is something wrong with the stream as well.
So to sum up, this is the code:
var csv = require('csv');
var parser = csv.parse({ relax_column_count: true, delimiter: "," });
parser.on("data", (chunk) => {
  console.log("one chunk");
  chunk.forEach((datum) => {
    console.log("data: ", datum);
  });
});
parser.on("error", (err) => {
  // Skip the error and resume stream here
  console.log("one error: ", err.message);
});
parser.on('close', function () {
  console.log(parser);
});
require('fs').createReadStream('test.csv').pipe(parser);
And in test.csv
00,01,02,03
10,11,12,23
21,22,23,24
30,31,
34,35,36,37
As requested, here is the code working with the stream:
var csv = require('csv');
var stream = require('stream');
var parser = csv.parse({ relax_column_count: true, delimiter: "," });
parser.on("data", (chunk) => {
  console.log("one chunk");
  chunk.forEach((datum) => {
    console.log("data: ", datum);
  });
});
parser.on("error", (err) => {
  // Skip the error and resume stream here
  console.log("one error: ", err.message);
});
parser.on('close', function () {
  console.log(parser);
});
var test = "00,01,02,03\n10,11,12,23\n21,22,\n30,31,32,33";
const myReadable = new stream.Readable({
  read(size) {
    this.push(test);
    test = null;
  }
});
myReadable.pipe(parser);
I believe the problem with your original stream was that you never pushed null at the end, so the stream was never ended properly.
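If you prefer to keep the push-based readable from the question, the minimal fix is along the same lines (a sketch of that idea):
var rs = new stream.Readable();
rs._read = () => {};
rs.push(test);
rs.push(null); // signal end-of-stream so the parser can emit its final rows and close
rs.pipe(parser);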

Request ends before readstream events are handled

I'm attempting to make a Node.js function which reads back data from a file with the following code:
app.post('/DownloadData', function(req, res) {
  req.on('data', function(data) {
    if (fs.existsSync('demoDataFile.dat')) {
      var rstream = fs.createReadStream('demoDataFile.dat');
      var bufs = [];
      rstream.on('data', function(chunk) {
        bufs.push(chunk);
        console.log("data");
      });
      rstream.on('end', function() {
        downbuf = Buffer.concat(bufs);
        console.log(downbuf.length);
      });
    }
  });
  req.on('end', function() {
    console.log("end length: " + downbuf.length);
    res.end(downbuf);
  });
  req.on('error', function(err) {
    console.error(err.stack);
  });
});
The problem is that the buffer comes back empty, because the req.on('end', ...) handler is called before any of the rstream.on events fire ("data" and the length aren't printed to the console until after "end length: " has been printed). Am I handling the events wrong, or is there some other issue? Any guidance would be appreciated.
Not sure why you're reading from req, because you're not using the body data at all. Also, because the data event can trigger multiple times, the code you're using to read the file may also get called multiple times, which probably isn't what you want.
Here's what I think you want:
app.post("/DownloadData", function(req, res) {
let stream = fs.createReadStream("demoDataFile.dat");
// Handle error regarding to creating/opening the file stream.
stream.on('error', function(err) {
console.error(err.stack);
res.sendStatus(500);
});
// Read the file data into memory.
let bufs = [];
stream.on("data", function(chunk) {
bufs.push(chunk);
console.log("data");
}).on("end", function() {
let downbuf = Buffer.concat(bufs);
console.log(downbuf.length);
...process the buffer...
res.end(downbuf);
});
});
You have to be aware that this will read the file into memory entirely. If it's a big file, it may require a lot of memory.
Since you don't specify which operations you have to perform on the file data, I can't recommend an alternative, but there are various modules available that can help you process file data in a streaming fashion (i.e. without having to read the file into memory entirely).
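For instance, if the file only needs to be sent back unchanged, one streaming option (a sketch under that assumption, not part of the original answer) is to pipe the read stream straight into the response:
app.post("/DownloadData", function(req, res) {
  const stream = fs.createReadStream("demoDataFile.dat");
  stream.on('error', function(err) {
    console.error(err.stack);
    res.sendStatus(500);
  });
  // Each chunk is forwarded to the client as soon as it is read,
  // so the whole file never has to be held in memory.
  stream.pipe(res);
});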

Node.js file write in loop fails randomly

Here is my code:
function aCallbackInLoop(dataArray) {
  dataArray.forEach(function (item, index) {
    fs.appendFile(fileName, JSON.stringify(item) + "\r\n", function (err) {
      if (err) {
        console.log('Error writing data ' + err);
      } else {
        console.log('Data written');
      }
    });
  });
}
I get random errors:
Data written
Data written
.
.
Error writing data Error: UNKNOWN, open 'output/mydata.json'
Error writing data Error: UNKNOWN, open 'output/mydata.json'
.
.
Data written
Error writing data Error: UNKNOWN, open 'output/mydata.json'
The function (aCallbackInLoop) is a callback for a web-service request, which returns chunks of data in dataArray. Multiple web-service requests are being made in a loop, so this callback is probably being called in parallel. I suspect it's some kind of file-lock issue, but I am not sure how to resolve it.
PS: I have made sure it's not a data issue (I am logging all items in dataArray).
Edit: Code after trying a write stream:
function writeDataToFile(fileName, data) {
  try {
    var wStream = fs.createWriteStream(fileName);
    wStream.write(JSON.stringify(data) + "\r\n");
    wStream.end();
  } catch (err) {
    console.log(err.message);
  }
}
function aCallbackInLoop(dataArray) {
  dataArray.forEach(function (item, index) {
    writeDataToFile(filename, item); // filename is a global var
  });
}
As you have observed, the later appendFile calls are not able to proceed while the previous appendFile calls are still in progress. In this particular case, it would be better to create a write stream:
var wstream = fs.createWriteStream(fileName);
dataArray.forEach(function (item) {
  wstream.write(JSON.stringify(item) + "\r\n");
});
wstream.end();
If you want to know when all the data has been written, you can register a function for the finish event, like this:
var wstream = fs.createWriteStream(fileName);
wstream.on("finish", function () {
  // Writing to the file is actually complete.
});
dataArray.forEach(function (item) {
  wstream.write(JSON.stringify(item) + "\r\n");
});
wstream.end();
Try using the synchronous version of appendFile - https://nodejs.org/api/fs.html#fs_fs_appendfilesync_filename_data_options
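A minimal sketch of that suggestion, reusing the names from the question (note that the synchronous call blocks the event loop for each write):
function aCallbackInLoop(dataArray) {
  dataArray.forEach(function (item) {
    try {
      // Blocks until the line is on disk, so writes cannot overlap.
      fs.appendFileSync(fileName, JSON.stringify(item) + "\r\n");
    } catch (err) {
      console.log('Error writing data ' + err);
    }
  });
}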
