I am new to JS and am trying to parse a CSV file on the backend using Node.js.
I have an array, states, in which I want to store the data from one column of the CSV. This is some very simple code that I wrote using fast-csv to do so. But whenever I run it, I get an empty array, []. I tried the same thing with papaparse and got the same result.
const fs = require('fs')
const csv = require('fast-csv')
const file = fs.createReadStream('main.csv');
var states = []
file
  .pipe(csv.parse({ headers: false }))
  .on('data', row => states.push(row[2]))
console.log(states)
But whenever I log it in the .on('end') block, the values are logged.
const fs = require('fs')
const csv = require('fast-csv')
const file = fs.createReadStream('main.csv');
var states = []
file
  .pipe(csv.parse({ headers: false }))
  .on('data', row => states.push(row[2]))
  .on('end', () => console.log(states)) // This works
console.log(states) // This doesn't
I think this is because the parser works asynchronously. I have tried resolving promises and using async/await, but I can't use the parsed content in the global scope.
Would love some help on this one.
You are right, the parser works asynchronously. It starts parsing and invokes your callbacks on events ('data' and 'end' in your case), but the code after the parser setup is executed immediately after parsing has started, before any rows arrive. So everything you do with the parsed data should be done in the 'end' event callback.
const fs = require('fs')
const csv = require('fast-csv')
// function to start the parser
const startParsing = (res) => {
  const file = fs.createReadStream('main.csv');
  // const is fine here: the array is mutated, but the binding is never reassigned
  const states = []
  file
    .pipe(csv.parse({ headers: false }))
    .on('data', row => states.push(row[2]))
    // run the next step once parsing has finished
    .on('end', () => outputData(states, res));
};
const outputData = (states, res) => {
  // your next actions here
  console.log(states);
  // for example, res.send(states) or anything else that completes the server response, if res is accessible
  res.send(states);
};
// then call startParsing wherever you need it, for example
server.get('/parser', (req, res) => {
  startParsing(res);
});
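If the goal is to use the parsed column outside the event callbacks, another option is to wrap the read in a function that returns a promise and await it. The sketch below is only an illustration under some assumptions: it uses Node's built-in readline module instead of fast-csv, so it only handles a simple CSV with no quoted fields or embedded commas, and readColumn and the sample file are hypothetical names.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const readline = require('readline');

// Hypothetical helper: resolves with the values of one column.
// Assumes a plain CSV: no quoted fields, no embedded commas or newlines.
async function readColumn(filePath, columnIndex) {
  const values = [];
  const lines = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  // for await suspends this function until the stream has delivered every line
  for await (const line of lines) {
    if (line.trim() === '') continue; // skip blank lines
    values.push(line.split(',')[columnIndex]);
  }
  return values;
}

// Demo with a throwaway file in the temp directory
const samplePath = path.join(os.tmpdir(), `states-sample-${process.pid}.csv`);
fs.writeFileSync(samplePath, '1,Alice,Texas\n2,Bob,Ohio\n');

const statesPromise = readColumn(samplePath, 2);
statesPromise.then((states) => console.log(states));
```

The array is still only available after the promise settles, so the calling code has to use then() or await; the difference is that the caller decides where that happens instead of nesting everything inside the 'end' handler.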
In an async IIFE at the bottom of this javascript, you'll see that I'm trying to: 1) read a JSON file, 2) get multiple RSS feed URLs from that data, 3) pull and parse the data from those feeds, and create an object with that data, so I can 4) write that pulled RSS data object to a JSON file. Everything for #1 and #2 is fine. I'm able to pull data from multiple RSS feeds in #3 (and log it to console), and I'm comfortable handling #4 when I get to that point later.
My problem is that, at the end of step #3, within the const parseFeed function, I am trying to create and push an object for that iteration of rssJSONValsArr.map() in the IIFE and it's not working. The rssFeedDataArr result is empty. Even though I am able to console.log those values, I can't create and push the new object I need in order to reach step #4. My creating of a similar object in #2 works fine, so I think it's the map I have to use within const parseFeed to pull the RSS data (using the rss-parser npm package) which is making object creation not work in step #3. How do I get rssFeedOject to work with the map data?
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import Parser from 'rss-parser';
const parser = new Parser();
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const feedsJSON = path.join(__dirname, 'rss-feeds-test.json');
const rssJSONValsArr = [];
const rssFeedDataArr = [];
const pullValues = (feedObject, i) => {
  const url = feedObject.feed.url;
  const jsonValsObject = {
    url: url,
  };
  rssJSONValsArr.push(jsonValsObject);
};
const parseFeed = async (url) => {
  try {
    const feed = await parser.parseURL(url);
    feed.items.forEach((item) => {
      console.log(`title: ${item.title}`); // correct
    });
    const rssFeedOject = {
      title: item.title,
    };
    rssFeedDataArr.push(rssFeedOject);
  } catch (err) {
    console.log(`parseFeed() ERROR 💥: ${err}`);
  }
};
(async () => {
  try {
    console.log('1: read feeds JSON file');
    const feedsFileArr = await fs.promises.readFile(feedsJSON, {
      encoding: 'utf-8',
    });
    const jsonObj = JSON.parse(feedsFileArr);
    console.log('2: get feed URLs');
    jsonObj.slice(0, 30).map(async (feedObject, i) => {
      await pullValues(feedObject, i);
    });
    console.log('rssJSONValsArr: ', rssJSONValsArr); // correct
    console.log('3: pull data from rss feeds');
    rssJSONValsArr.map(async (feedItem, i) => {
      await parseFeed(feedItem.url, i);
    });
    console.log('rssFeedDataArr: ', rssFeedDataArr); // empty !!!
    // console.log('4: write rss data to JSON file');
    // await fs.promises.writeFile(
    //   `${__dirname}/rss-bulk.json`,
    //   JSON.stringify(rssFeedDataArr)
    // );
    console.log('5: Done!');
  } catch (err) {
    console.log(`IIFE CATCH ERROR 💥: ${err}`);
  }
})();
Example JSON file with two RSS feed URLs:
[
  {
    "feed": {
      "details": {
        "name": "nodejs"
      },
      "url": "https://news.google.com/rss/search?q=nodejs"
    }
  },
  {
    "feed": {
      "details": {
        "name": "rss-parser"
      },
      "url": "https://news.google.com/rss/search?q=rss-parser"
    }
  }
]
Any and all help appreciated. Thanks
The problem is that you print rssFeedDataArr right after the .map call. As noted in the comments, .map is also misused there, since its return value is never used; forEach would be the conventional choice for a loop like that. For every value in rssJSONValsArr you call an anonymous async function which in turn awaits parseFeed, so you create a promise in each iteration, but those promises are only resolved after your print statement has executed. You need to wait for all of those promises to resolve before printing rssFeedDataArr. Since the promises can run in parallel, one way to do that is Promise.all, like this:
await Promise.all(
  rssJSONValsArr.map(async (feedItem, i) => {
    await parseFeed(feedItem.url, i);
  })
)
We can simplify it even further by returning the promise created by parseFeed directly:
await Promise.all(
  rssJSONValsArr.map((feedItem, i) => parseFeed(feedItem.url, i))
)
In this case map is the right method, not forEach, because Promise.all needs the array of promises that map returns.
In the case of rssJSONValsArr it works because pullValues is synchronous: there is no await inside its definition, so each call finishes its push immediately, even though the callback that calls it is declared async.
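As a rough illustration of the Promise.all approach, here is a self-contained sketch in which parseFeed returns its data instead of pushing into a shared array. fakeParseURL is a hypothetical stand-in for parser.parseURL so the example runs without network access, and the URLs and titles are made up. Note that the object for each item is built inside the map callback, because item only exists within that callback's scope.

```javascript
// Hypothetical stand-in for parser.parseURL: resolves after a short delay
const fakeParseURL = (url) =>
  new Promise((resolve) =>
    setTimeout(() => resolve({ items: [{ title: `first item of ${url}` }] }), 10)
  );

// Returns the parsed data instead of pushing into a shared array
const parseFeed = async (url) => {
  const feed = await fakeParseURL(url);
  // item is only in scope inside this callback, so the object is built here
  return feed.items.map((item) => ({ title: item.title }));
};

const urls = ['feed-a', 'feed-b'];

// map on its own only produces an array of pending promises
const pending = urls.map((url) => parseFeed(url));
console.log(pending[0] instanceof Promise);

// Promise.all resolves once every feed has been parsed, in input order;
// flat() merges the per-feed arrays into a single list
const allFeeds = Promise.all(pending).then((perFeed) => perFeed.flat());
allFeeds.then((rssFeedDataArr) => console.log(rssFeedDataArr));
```

Returning values rather than mutating an outer array also keeps the results in the same order as the input URLs, regardless of which request finishes first.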
I'm using the csv-parser npm package to parse a sample CSV. My only confusion is about accessing the parsed array after running these functions. I understand that I'm pushing the data in .on('data') and then calling console.log(results) in .on('end') to show what's being stored. Why do I get undefined when I try to access results after running those functions? Doesn't results hold the stored information?
const csv = require('csv-parser');
const fs = require('fs');
const results = [];
fs.createReadStream('demo.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
  });
I came here to find the solution to the same issue.
Since this is an async operation, what works here is to call the function that acts on your parsed data once the end handler is called. Something like this should work in this situation:
const csv = require('csv-parser');
const fs = require('fs');
const results = [];
fs.createReadStream('demo.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
    csvData(results);
  });
const csvData = ((csvInfo) => {
  console.log(csvInfo);
  console.log(csvInfo.length);
})
I can get results in .on('end', () => { console.log(results);}); , but
if I put a console.log() after the createReadStream , results is
undefined, does that make sense? – Videoaddict101
Your stream works asynchronously, which means your data and end handlers will be called later, while the rest of your JavaScript continues to execute. So accessing your array right after the fs.createReadStream statement will give you an empty array.
Understanding async behavior is very important in JavaScript, and even more so in Node.js.
Please have a look at the different resources on handling async code, such as Promises and async/await.
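To make the ordering concrete, here is a small sketch that records the array length right after the handlers are attached and again after the stream has ended. It is an assumption-laden stand-in, not the csv-parser setup itself: Readable.from replaces fs.createReadStream(...).pipe(csv()), the rows are made up, and collect is a hypothetical name.

```javascript
const { Readable } = require('stream');
const { once } = require('events');

async function collect() {
  const results = [];
  // Readable.from stands in for fs.createReadStream(...).pipe(csv())
  const stream = Readable.from([{ id: 1 }, { id: 2 }]);
  stream.on('data', (row) => results.push(row));

  // Still empty here: no 'data' event has fired yet
  const lengthRightAway = results.length;

  // Suspend this function until the stream emits 'end'
  await once(stream, 'end');
  return { lengthRightAway, lengthAfterEnd: results.length };
}

const outcome = collect();
outcome.then((o) => console.log(o));
```

events.once returns a promise that resolves when the named event fires, which is one way to wait for 'end' without writing the promise wrapper by hand.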
You should use neat-csv, which is the endorsed wrapper for csv-parser and gives you a promise interface.
That said, you can also create a promise yourself and resolve it in the on("end", callback):
import fs from "fs";
import csv from "csv-parser";
function getCsv(filename) {
  return new Promise((resolve, reject) => {
    const data = [];
    fs.createReadStream(filename)
      .pipe(csv())
      .on("error", (error) => reject(error))
      .on("data", (row) => data.push(row))
      .on("end", () => resolve(data));
  });
}
console.log(await getCsv("../assets/logfile0.csv"));
I'm trying to use async/await to build a Node.js script. I currently have a helper file called repo.js that gets data from GitHub's API and returns it so I can access it from other JS files in my Node application. repo.js looks like this:
const axios = require('axios')
const repo = async () => {
  const data = await axios.get('https://api.github.com/repos/OWNER/REPO/releases', {
    headers: {
      'Authorization': 'token MYTOKEN'
    }
  })
  return data
}
exports.repo = repo
And then in my main.js file I'm trying to do...
const repo = require('./src/utils/repo')
program
.option('-d, --debug', 'output extra debugging')
.option('-s, --small', 'small pizza size')
.option('-p, --pizza-type <type>', 'flavour of pizza')
const repoData = repo.repo
console.log(repoData)
Unfortunately, this just returns [AsyncFunction: repo] to the console which isn't the intended behaviour. Why can't I access the contents here?
UPDATE
Based on some responses I've been given, I'm aware of the fact I need my code inside of a async function or to use .then(). The issue is, I don't want to put all of my application's code inside of a .then() just to rely on one thing from an API.
Example:
var version = ''
repo.getRepoDetails().then((res) => {
version = res.data[0].body.tag_name
})
Now I have access to version everywhere.
Every async function returns a promise, meaning that you need to wait for it to settle in order to read its result.
repo.repo().then(res => console.log(res))
If your application is a simple Node.js script (or a single file), then you can wrap your code inside an IIFE like this:
const repo = require('./src/utils/repo'); // the semicolon matters: without it the IIFE below is parsed as a call on the require() result
(async () => {
  program
    .option('-d, --debug', 'output extra debugging')
    .option('-s, --small', 'small pizza size')
    .option('-p, --pizza-type <type>', 'flavour of pizza')
  const repoData = await repo.repo() // you can use await now instead of then()
  console.log(repoData)
})()
An async function always returns a promise, so you can access the result using promise.then(), like this:
repo.repo().then(result => result)
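The difference between the function, the promise it returns, and the resolved value can be sketched as follows. This is only an illustration: the repo function here is a hypothetical stand-in for the axios call, returning made-up data so that it runs without network access. It also shows why assigning to an outer variable inside then() does not make the value available to code that runs immediately afterwards.

```javascript
// Hypothetical stand-in for the axios call in repo.js (made-up data, no network)
const repo = async () => {
  return { data: [{ tag_name: 'v1.0.0' }] };
};

console.log(typeof repo);               // the function itself, never called
console.log(repo() instanceof Promise); // calling it yields a promise

// The value is only available inside then() or after await
const versionPromise = repo().then((res) => res.data[0].tag_name);
versionPromise.then((v) => console.log(v));

// Assigning to an outer variable inside then() does not help code that runs right away
let version = '';
const assigned = repo().then((res) => {
  version = res.data[0].tag_name;
});
const versionSeenSynchronously = version; // still '' because then() has not run yet
```

Code that needs version therefore still has to run after assigned settles, for example inside an async function that awaits it.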
I'm trying to fetch data from an S3 object and put it into an array. I plan to map through this array and display the data on a React front end in grid/list whatever. I'm struggling with nested functions though, so I'd appreciate some help.
const dataFromS3 = async (bucket, file) => {
  let lines = [];
  const options = {
    Bucket: bucket,
    Key: file
  };
  s3.getObject(options, (err, data) => {
    if (err) {
      console.log(err);
    } else {
      let objectData = data.Body.toString('utf-8');
      lines.push(objectData);
      console.log(lines);
      return lines;
    }
  });
};
Formatting is a bit weird but this is my function to get data from s3. I want to take the output of this function in the form of an array and pass it to my '/' route which I'm testing:
app.get('/', async (req, res, next) => {
  try {
    let apolloKey = await dataFromS3(s3Bucket, apolloKeywords);
    res.send(apolloKey);
  } catch (err) {
    console.log('Error: ', err);
  }
});
It seems that the value of lines returned inside the s3.getObject callback needs to be returned from the outer function so that I can access it in app.get, but I haven't managed to do that after several attempts. The value of lines is an empty array if I return it at the end of dataFromS3(), and I can't find a way to return it once it is filled. I've also tried using promises with a method found here - How to get response from S3 getObject in Node.js? - but I get a TypeError: Converting Circular Structure to JSON...
Thank you
You need to write your dataFromS3 function like this. You were not returning anything from it. The AWS SDK also provides a promise-based interface.
const dataFromS3 = async (bucket, file) => {
  const lines = [];
  const options = {
    "Bucket": bucket,
    "Key": file
  };
  // .promise() turns the request into a promise that can be awaited
  const data = await s3.getObject(options).promise();
  const objectData = data.Body.toString("utf-8");
  lines.push(objectData); // you might need a conversion here, e.g. JSON.parse(objectData)
  console.log(lines);
  return lines;
};
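The reason the original version returned nothing is that return lines inside the callback only returns from the callback, not from dataFromS3. For a callback-style API that has no promise interface of its own, a general alternative is util.promisify. The sketch below uses fakeGetObject, a hypothetical stand-in with the same (options, callback) shape, so it runs without AWS credentials; dataFromStore, the bucket name, and the key are made up as well. With the real SDK, the .promise() call shown above is the simpler route, since promisify does not preserve the this binding of methods.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for a callback-style API shaped like s3.getObject(options, callback)
const fakeGetObject = (options, callback) => {
  setTimeout(() => {
    if (!options.Key) return callback(new Error('missing Key'));
    callback(null, { Body: Buffer.from(`contents of ${options.Bucket}/${options.Key}`) });
  }, 10);
};

// promisify converts (args..., callback(err, value)) into a function returning a promise
const getObjectAsync = promisify(fakeGetObject);

const dataFromStore = async (bucket, file) => {
  const data = await getObjectAsync({ Bucket: bucket, Key: file });
  // This return belongs to the async function itself, so the caller receives the value
  return [data.Body.toString('utf-8')];
};

const linesPromise = dataFromStore('my-bucket', 'keywords.txt');
linesPromise.then((lines) => console.log(lines));
```

Errors passed to the callback become rejections, so a try/catch around the await in the route handler will see them.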
In the following code, I'm reading some files and getting their filename and text. After that, I'm storing data in an option variable to generate an epub file:
const Epub = require("epub-gen")
const folder = './files/'
const fs = require('fs')
let content = []
fs.readdir(folder, (err, files) => {
  files.forEach(filename => {
    const title = filename.split('.').slice(0, -1).join('.')
    const data = fs.readFileSync(`${folder}${filename}`).toString('utf-8')
    content.push({ title, data })
  })
})
const option = {
  title: "Alice's Adventures in Wonderland", // *Required, title of the book.
  content
}
new Epub(option, "./text.epub")
The problem is, new Epub runs before the files are read, before content is ready. I think Promise.all is the right candidate here. I checked the Mozilla docs. But it shows various promises as example, but I have none. So, I'm not very sure how to use Promise.all here.
Any advice?
Your problem is with readdir, which is asynchronous, so new Epub, as you already figured out, is called before readdir's callback runs.
Switch to using readdirSync or move const option ... new Epub... inside the callback parameter of readdir, after files.forEach.
At the moment you can do everything synchronous since you use readFileSync.
So you can place the Epub creation after the forEach loop.
If you want to go async, my first question would be:
Does your node.js version support util.promisify ( node version 8.x or higher iirc )?
If so, that can be used to turn the callback functions like readFile and such into promises. If not, you can use the same logic, but then with nested callbacks like the other solutions show.
const FS = require( 'fs' );
const { promisify } = require( 'util' );
const folder = './files/';
const readFile = promisify( FS.readFile );
// fs has no readFolder function; readdir is the call that lists a directory
const readFolder = promisify( FS.readdir );
readFolder( folder )
  // extract the file paths. Maybe using `/${filename}` suffices here.
  .then( files => files.map( filename => `${folder}${filename}`))
  // map the paths with readFile so we get an array with promises.
  .then( file_paths => file_paths.map( path => readFile( path )))
  // fetch all the promises using Promise.all() .
  .then( file_promises => Promise.all( file_promises ))
  .then( data => {
    // do something with the data array that is returned, like extracting the title.
    // create the Epub objects by mapping the data values with their titles
  })
  // error handling.
  .catch( err => console.error( err ));
Add promises to an array. Each promise should resolve with the value you were pushing into content.
When all promises resolve, the returned value will be the array previously known as content.
Also, you can, and should, use async fs calls throughout. So readFileSync can be replaced with readFile (async). I did not make that replacement in your code, however, so that you can clearly see what was required to answer your original question.
Promise.all has to be called inside the readdir callback, after the promises have been collected, and the Epub is created once they have all resolved:
const Epub = require("epub-gen")
const folder = './files/'
const fs = require('fs')
let promises = []
fs.readdir(folder, (err, files) => {
  if (err) throw err
  files.forEach(filename => {
    promises.push(new Promise((resolve, reject) => {
      const title = filename.split('.').slice(0, -1).join('.')
      // still synchronous, as in the question; a thrown error rejects the promise
      const data = fs.readFileSync(`${folder}${filename}`).toString('utf-8')
      resolve({
        title,
        data
      })
    }))
  })
  // content is the array of resolved { title, data } objects
  Promise.all(promises).then((content) => {
    const option = {
      title: "Alice's Adventures in Wonderland", // *Required, title of the book.
      content
    }
    new Epub(option, "./text.epub")
  })
})
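For a fully asynchronous variant, the built-in fs.promises API can be combined with Promise.all. The following is a minimal sketch under a few assumptions: loadContent is a hypothetical helper name, the folder is assumed to contain only regular text files, and the demo writes two made-up chapter files to a temporary directory so the example runs on its own. The Epub call is left as a comment because epub-gen is not loaded here.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: reads every file in a folder and resolves with [{ title, data }, ...]
// Assumes the folder contains only regular files.
async function loadContent(folder) {
  const files = (await fs.promises.readdir(folder)).sort();
  // One promise per file; Promise.all keeps the results in the same order as files
  return Promise.all(
    files.map(async (filename) => ({
      title: path.parse(filename).name, // file name without its extension
      data: await fs.promises.readFile(path.join(folder, filename), 'utf-8'),
    }))
  );
}

// Demo on a throwaway folder
const demoFolder = fs.mkdtempSync(path.join(os.tmpdir(), 'chapters-'));
fs.writeFileSync(path.join(demoFolder, 'chapter-1.txt'), 'Down the Rabbit-Hole');
fs.writeFileSync(path.join(demoFolder, 'chapter-2.txt'), 'The Pool of Tears');

const contentPromise = loadContent(demoFolder);
contentPromise.then((content) => {
  console.log(content);
  // new Epub({ title: "Alice's Adventures in Wonderland", content }, './text.epub') would go here
});
```

Because the files are read concurrently, this tends to pay off mainly when there are many files; for a handful of chapters the synchronous version is just as reasonable.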