Refactoring Express code and cannot figure out variable scope - javascript

I hope this is not a silly question. I have a homepage route that loads up a lot of Mongo collections, and I originally had it loop through them and add them to an array that was rendered to a webpage. However, the collections have become more complicated and need to be populated, so I can no longer use a loop to accomplish this and needed to refactor to fetching the collections individually. However, I seem to be having problems with the variables' scope, as they always come back empty outside of the .find function.
My original code was this:
const collections = [User, Ticket, Client, Job, Transaction];
let endCollections = [];
for (let i = 0; i < collections.length; i++) {
    await collections[i].find({}, function (err, foundCollection) {
        if (err) {
            console.log(err);
        } else {
            endCollections[i] = foundCollection;
        }
    });
}
res.render("dashboard", {transactions: endCollections[4], clients: endCollections[2], tickets: endCollections[1], jobs: endCollections[3]});
And this worked fine. But I need to populate the individual collections, so this was no longer useful. I rewrote it with populate calls, but I am having problems changing the outer variables inside the callbacks. Here is the new code I am trying:
let transactions = [],
    clients = [],
    jobs = [],
    tickets = [];
await Transaction.find({}).populate("job").populate("client").populate("deposited_by_user").exec(function (err, foundTransactions) {
    if (err) {
        console.log(err);
    } else {
        for (let i = 0; i < foundTransactions.length; i++) {
            foundTransactions[i]["transaction_info"]["new_amount"] = numberWithCommas(foundTransactions[i]["transaction_info"]["amount"]);
        }
    }
    transactions = foundTransactions;
});
await Client.find({}).populate("transactions").populate("jobs").exec(function (err, foundClients) {
    if (err) {
        console.log(err);
    }
    clients = foundClients;
});
await Ticket.find({}).populate("created_by").populate("assigned_user").populate("completed_by_user").exec(function (err, foundTickets) {
    if (err) {
        console.log(err);
    }
    tickets = foundTickets;
});
await Job.find({}).populate("created_by").populate("client").populate("transactions").exec(function (err, foundJobs) {
    if (err) {
        console.log(err);
    }
    jobs = foundJobs;
});
res.render("dashboard", {transactions: transactions, clients: clients, tickets: tickets, jobs: jobs});
For example, if I console.log jobs right after the line jobs = foundJobs;, it shows the jobs array being populated. However, if I console.log jobs right before the res.render, it shows up empty. Considering that endCollections in my original code seemed to be changed within the callbacks just fine, I am unsure why my new code does not do the same; everything comes back empty. I know that somehow the scope of the variable is what is wrong here, but I cannot see how. Is there something obvious I am missing? Thanks.

Here is the answer, so it is not buried in the post's comments.
After reading the docs, I think you should either use await with a bare exec() or use exec(callback), but not both.
What happens when you use both is that exec(callback) sees you passed a callback, asynchronously executes your query, and attaches your callback to the query promise's .then, to be called once that promise settles. It then returns immediately, but it does not return the query promise, since you passed a callback. The await is simply awaiting the normal (probably undefined) return value of the call, which is why removing it does not change anything.
After that return value is awaited, res.render executes, and some time later the promise created inside the exec(callback) call settles and the callback you passed is executed.
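To make the two valid forms concrete, here is a minimal sketch (Model is a placeholder for any of your mongoose models):
// Promise style: no callback, so exec() returns a promise you can await
const docs = await Model.find({}).exec();

// Callback style: no await; the results arrive in the callback instead
Model.find({}).exec(function (err, docs) {
    if (err) return console.log(err);
    // use docs here
});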
So what is the appropriate way of fixing this? I would encourage you to read deeper into async/await, promises, and the docs I linked above and find it out yourself before you read on, but since the solution is quite simple I'll leave it here.
// your variable declarations
try {
    const foundTransactions = await Transaction.find({}).populate("job").populate("client").populate("deposited_by_user").exec();
    // your for loop
    transactions = foundTransactions;
    // same for the other calls
} catch (e) {
    console.log(e);
}
res.render("dashboard", {transactions: transactions, clients: clients, tickets: tickets, jobs: jobs});

Related

Sequential async operations for each in Node JS

I am working on a database migration. This requires querying one DB, getting an array of records, and performing a set of async operations to insert the data in the new DB. In order to keep data consistency, I want to insert the records one at a time, so I want each operation to run sequentially. The only way I have found to do this is using recursion.
Is there a cleaner way of doing this same thing? I know there is a library called async (https://caolan.github.io/async/v3/), which I have never tried before.
The recursive method I have written looks like this:
const insertItem = async (data) => {
    let item = data[0];
    if (!item) {
        // I am done, return
        return;
    }
    try {
        // Do multiple await calls to insert record into new database
    } catch (e) {
        // Recover from error (DB rollbacks, etc)
    } finally {
        // Remove inserted or failed item from collection.
        data.shift();
        await insertItem(data);
    }
};
// Query original database
getInfo().then((data) => insertItem(data));
You can use a plain for...of loop inside an async function; await makes it wait for each response:
const dataArr = ['data1', 'data2', 'data3'];

async function processItems(arr) {
    for (const el of arr) {
        const response = await insertData(el);
        // add some code here to process the response.
    }
}

processItems(dataArr);
The posted code iterates through the data collection (an array) retrieved from the first database by using data.shift to mutate the argument array before recursively calling the single function that handles everything.
To clean this up, remove the shift and recursive calls by splitting the data processing function into two:
one function to step through the data retrieved from the first database,
a second function to insert records into the second database as required.
This removes the need for the finally block and leaves the structure of the code looking more like
async function insertData(data) {
    for (let index = 0; index < data.length; ++index) {
        await insertItem(data[index]);
    }
}

async function insertItem(data) {
    try {
        // Do multiple await calls to insert record into new database
    } catch (e) {
        // Recover from error (DB rollbacks, etc)
        // throwing an error here aborts the caller, insertData()
    }
}

getInfo().then(insertData).catch(/*... handle fatal error ...*/);
Depending on preferred style, insertItem could be declared as a nested function within insertData to keep things looking neat, and insertData could be written as an anonymous function argument of the then call after getInfo(), as sketched below.
It is possible, of course, to perform asynchronous operations sequentially by other means, but using await inside an async function is perhaps the simplest coding method.
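For illustration, a hedged sketch of that nested style (getInfo and the insert logic are the same placeholders as above):
getInfo().then(async function insertData(data) {
    // nested helper: inserts one record into the second database
    async function insertItem(item) {
        try {
            // Do multiple await calls to insert record into new database
        } catch (e) {
            // Recover from error (DB rollbacks, etc)
        }
    }
    for (const item of data) {
        await insertItem(item);
    }
}).catch(/*... handle fatal error ...*/);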

Get Value out of Callback

I have a callback, and I need to get the array out of it. I am trying to return the awaited array into the predefined one.
let documents = [];
socket.on('echo', async function (documents, data) {
    documents = await data;
    console.log(documents); // shows me what I want to see
});
console.log(documents); // empty Array
I need the result in my predefined array documents.
I have read several tutorials, but I don't get it. I know this has been asked several times on Stack Overflow, but all those threads seem to be more complex than my situation, so I hope to get it cleared up with a simpler one.
You need to understand something first: what is inside the callback doesn't run until the server emits that event, in your case 'echo'.
What I think you want to do is use documents outside the callback. You can create a function and call it when the event is emitted.
Something like this:
const manageDocuments = (documents) => {
    // do what you want with the documents array
    console.log(documents);
};

socket.on('echo', async function (documents, data) {
    documents = await data;
    manageDocuments(documents);
});
Of course you can also get rid of the async/await inside the callback, since data is not a promise. Note that awaiting socket.on itself, as below, does not help:
let documents = await socket.on('echo', async function (documents, data) {
    console.log(documents);
    return data;
});
console.log(documents);
The problem is that the code outside the socket callback executes with the empty array because it runs only once, when the script starts.
If you want access to the documents outside the socket callback, you have to make them persist, or use socket.on inside another loop.
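If you really do need to await the value itself, a common pattern is to wrap the one-shot event in a Promise; a minimal sketch, assuming a socket.io-style socket that supports .once:
function receiveDocuments(socket) {
    return new Promise((resolve) => {
        // resolve with the payload the first time 'echo' fires
        socket.once('echo', (documents, data) => resolve(data));
    });
}

// inside an async function:
const documents = await receiveDocuments(socket);
console.log(documents); // now populated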

Why does my JavaScript take so long to run?

I'm working with Cloud Functions for Firebase, and I get a timeout with some of my functions. I'm pretty new to JavaScript. It looks like I need to put a for loop inside a promise, and I am running into problems: the code exits the loop too early, and I think it is taking a long time. Do you have some way to improve this code and make it faster?
exports.firebaseFunctions = functions.database.ref("mess/{pushId}").onUpdate(event => {
    // first I get the event val and an object inside firebase
    const original = event.data.val();
    const users = original.uids; // these are all the user uids
    // so first I get all the user uids and put them inside an array
    let usersUids = [];
    for (let key in users) {
        usersUids.push(users[key]);
    }
    // now I make a promise to use all these uids and get each device token,
    // and save them inside another child in firebase
    return new Promise((resolve) => {
        let userTokens = [];
        usersUids.forEach(element => {
            admin.database().ref('users/' + element).child('token').once('value', snapShot => {
                if (snapShot.val()) { // if the token exists, put it in the array
                    userTokens.push(snapShot.val());
                }
            });
        });
        resolve({
            userTokens
        });
    }) // now I use then here, to get userTokens and save them in another child in the firebase database
        .then((res) => {
            return admin.database().ref("USERS/TOKENS").push({
                userTokens: res,
            });
        });
});
You are making network requests with Firebase, so maybe that's why it's slow. You are making one request per user, so if you have 100 ids there, it may well take a while.
But there's another problem that I notice: you are resolving to an empty list. When you call resolve, the forEach has finished and every request has started, but none of the results have been added to the list yet. To wait for several promises, create an array of promises and use Promise.all to wait for all of them in parallel: change the forEach to a map, collect the returned promises, and return Promise.all.
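A minimal sketch of that fix, replacing the new Promise block in the function above (assuming once('value') returns a promise when called without a callback, as it does in the Admin SDK):
return Promise.all(
    usersUids.map(uid =>
        admin.database().ref('users/' + uid).child('token').once('value')
    )
).then(snapshots => {
    // keep only the tokens that actually exist
    const userTokens = snapshots.map(s => s.val()).filter(Boolean);
    return admin.database().ref("USERS/TOKENS").push({ userTokens });
});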

Duplicate Array Data Web Scraping

I can't seem to get the duplicate articles out of my web scraper results. This is my code:
app.get("/scrape", function (req, res) {
request("https://www.nytimes.com/", function (error, response, html) {
// Load the HTML into cheerio and save it to a variable
// '$' becomes a shorthand for cheerio's selector commands, much like jQuery's '$'
var $ = cheerio.load(html);
var uniqueResults = [];
// With cheerio, find each p-tag with the "title" class
// (i: iterator. element: the current element)
$("div.collection").each(function (i, element) {
// An empty array to save the data that we'll scrape
var results = [];
// store scraped data in appropriate variables
results.link = $(element).find("a").attr("href");
results.title = $(element).find("a").text();
results.summary = $(element).find("p.summary").text().trim();
// Log the results once you've looped through each of the elements found with cheerio
db.Article.create(results)
.then(function (dbArticle) {
res.json(dbArticle);
}).catch(function (err) {
return res.json(err);
});
});
res.send("You scraped the data successfully.");
});
});
// Route for getting all Articles from the db
app.get("/articles", function (req, res) {
// Grab every document in the Articles collection
db.Article.find()
.then(function (dbArticle) {
res.json(dbArticle);
})
.catch(function (err) {
res.json(err);
});
});
Right now I am getting five copies of each article sent to the user. I have tried db.Article.distinct and various versions of this to filter the results down to only unique articles. Any tips?
In Short:
Switching var results = [] from an array to an object, var results = {}, did the trick for me. I still haven't figured out the exact reason for the duplicate insertion of documents in the database; I will update as soon as I find out.
Long Story:
You have several mistakes and points of improvement in your code; I will try to point them out. Let's go through the mistakes first to make your code error free.
Mistakes
1. Although mongoose's Model.create (and new Model()) does seem to work fine with arrays here, I haven't seen such a use before and it does not look appropriate.
If you intend to create documents one after another, represent each document using an object instead of an array. Using an array is more mainstream when you intend to create multiple documents at once.
So switch -
var results = [];
to
var results = {};
2. Sending response headers after they have already been sent will create an error. I don't know if you have noticed it yet, but it's pretty clear upfront; once the error pops up, the remaining documents won't get stored, because of a PromiseRejection error if you haven't set up a try/catch block.
The db.Article.create calls inside $("div.collection").each(function (i, element)) run asynchronously, so your process control won't wait for each document to be processed; instead it immediately executes res.send("You scraped the data successfully.");.
This effectively terminates the HTTP connection between the client and the server, and any further response-terminating calls like res.json(dbArticle) or res.json(err) will throw an error.
So just comment out the res.json statements inside the .create's then and catch methods. This will terminate the response before all the articles are saved in the DB, but you need not worry, as your code will still work behind the scenes, saving the articles in the database for you (asynchronously).
If you want your response to be terminated only after you have successfully saved the data, then change your middleware implementation to:
request('https://www.nytimes.com', (err, response, html) => {
    var $ = cheerio.load(html);
    var results = [];
    $("div.collection").each(function (i, element) {
        var ob = {};
        ob.link = $(element).find("a").attr("href");
        ob.title = $(element).find("a").text();
        ob.summary = $(element).find("p.summary").text().trim();
        results.push(ob);
    });
    db.Article.create(results)
        .then(function (dbArticles) {
            res.json(dbArticles);
        }).catch(function (err) {
            return res.json(err);
        });
});
After making the above changes (and even after just the first one), my version of your code ran fine. So if you want you can continue with your current version, or you may read on for some points of improvement.
Points of Improvement
1. The era of callbacks is long gone:
Convert your implementation to utilise Promises, as they are more maintainable and easier to reason about. Here is what you can do:
Change the request library from request to axios, or any library that supports Promises by default; a sketch follows below.
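For instance, a hedged sketch of the scrape route's fetch step using axios (assuming axios is installed; the cheerio scraping and the create call stay the same):
const axios = require("axios");

app.get("/scrape", async function (req, res) {
    try {
        const { data: html } = await axios.get("https://www.nytimes.com/");
        const $ = cheerio.load(html);
        // ... same cheerio scraping and db.Article.create(results) as above
    } catch (err) {
        res.json(err);
    }
});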
2. Make effective use of mongoose methods for insertion. You can perform bulk inserts of multiple documents in just one query, as sketched below. You may find the mongodb docs on creating documents quite helpful.
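For example, a sketch using mongoose's Model.insertMany (ordered: false lets the remaining documents proceed even if one insert fails):
db.Article.insertMany(results, { ordered: false })
    .then(function (dbArticles) {
        res.json(dbArticles);
    })
    .catch(function (err) {
        res.json(err);
    });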
3. Start using a browser automation library such as puppeteer or Nightmare.js for data scraping tasks. Trust me, they make life a lot easier than cheerio or any other library for the same job. Their docs are really good and well maintained, so you won't have a hard time picking them up.

node.js data consistency when iterating asynchronously

I have a tool whose basic idea is as follows:
// get a bunch of couchdb databases. this is an array
const jsonFile = require('jsonfile');
let dbList = getDbList();
const filePath = 'some/path/to/file';
const changesObject = {};

// iterate the db list. do asynchronous stuff on each iteration
dbList.forEach(function (db) {
    let merchantDb = nano.use(db);
    // get some changes from the database. validate inside callback
    merchantDb.get("_changes", function (err, changes) {
        validateChanges(changes);
        changesObject['db'] = changes.someAttribute;
        // write changes to file
        jsonFile.writeFile(filePath, changesObject, function (err) {
            if (err) {
                logger.error("Unable to write to file: ");
            }
        });
    });
});

const validateChanges = function (changes) {
    if (!validateLogic(changes)) sendAlertMail();
};
For performance reasons the iteration is not done synchronously, so there can be multiple iterations running in 'parallel'. My question is: can this cause any data inconsistencies and/or any issues with the file writing process?
Edit:
The same file gets written to on each iteration.
Edit 2:
The changes are stored as a JSON object with key value pairs. The key being the db name.
If you're really writing to a single file, which you appear to be (though it's hard to be sure), then no; you have a race condition in which multiple callbacks will try to write to the same file, possibly at the same time (remember, I/O isn't done on the JavaScript thread in Node unless you use the *Sync functions), which will at best mean the last one wins and will at worst mean I/O errors because of overlap.
If you're writing to separate files for each db, then provided there's no cross-talk (shared state) amongst validateChanges, validateLogic, sendAlertMail, etc., that should be fine.
Just for detail: It will start tasks (jobs) getting the changes and then writing them out; the callbacks of the calls to get won't be run until later, when all of those jobs are queued.
You are creating closures in loops, but the way you're doing it is okay, both because you're doing it within the forEach callback and because you're not using db in the get callback (which would be fine with the forEach callback but not with some other ways you might loop arrays). Details on that aspect in this question's answers if you're interested.
This line is suspect, though:
let merchantDb = nano.use('db');
I suspect you meant (no quotes):
let merchantDb = nano.use(db);
For what it's worth, it sounds from the updates to the question and your various comments like the better solution would be not to write out the file separately each time. Instead, you want to gather up the changes and then write them out.
You can do that with the classic Node-callback APIs you're using like this:
let completed = 0;
// iterate the db list. do asynchronous stuff on each iteration
dbList.forEach(function (db) {
    let merchantDb = nano.use(db);
    // get some changes from the database. validate inside callback
    merchantDb.get("_changes", function (err, changes) {
        if (err) {
            // Deal with the fact there was an error (don't return)
        } else {
            validateChanges(changes);
            changesObject[db] = changes.someAttribute; // <=== NOTE: This line had 'db' rather than db, I assume that was meant to be just db
        }
        if (++completed === dbList.length) {
            // All done, write changes to file
            jsonFile.writeFile(filePath, changesObject, function (err) {
                if (err) {
                    logger.error("Unable to write to file: ");
                }
            });
        }
    });
});
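If your nano version returns promises when no callback is passed (recent versions do; treat this as an assumption), the same gather-then-write logic can be expressed with Promise.all:
Promise.all(
    dbList.map(function (db) {
        return nano.use(db).get("_changes").then(function (changes) {
            validateChanges(changes);
            changesObject[db] = changes.someAttribute;
        });
    })
).then(function () {
    // All done, write the changes to the file once
    jsonFile.writeFile(filePath, changesObject, function (err) {
        if (err) {
            logger.error("Unable to write to file: ");
        }
    });
}).catch(function (err) {
    // Deal with whichever request failed first
    logger.error(err);
});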
