I have hundreds of elements to fetch from a MongoDB database and print on the front-end.
Fetching them all in a single request could hurt performance, since it carries a big payload in the response body.
So I'm looking for a solution that splits my Angular request into several requests, with the constraint that they run simultaneously.
Example:
MONGODB
Collection: Elements (_id, name, children, ...)
Documents: 10000+
But we only need ~100-500 elements
ANGULAR:
const observables = [];
const iteration = 5, idParent = 'eufe4839ffcdsj489483'; // example
for (let i = 0; i < iteration; i++) {
  observables.push(
    this.myHttpService.customGetMethod<Element>(COLLECTION_NAME, endpoint, 'idParent=' + idParent + '&limit=??')); // url with query
}
forkJoin(observables).subscribe(
  data => {
    this.elements.push(data[0]);
    this.elements.push(data[1]);
  },
  err => console.error(err)
);
I use forkJoin because I need simultaneous requests for better performance.
The idea is to send multiple requests to the back-end with different limit parameter values and end up with the whole set of elements in the data object.
The only purpose is to split the request to avoid high latency and possible errors caused by the size of the payload body.
How can I proceed with the given stack to perform such an operation? I would like to avoid using websockets.
I think forkJoin is used when you want to resolve all the observables in parallel, but then you need to wait for all the requests. What if one fails? forkJoin completes on the first error as soon as it encounters one, and you can't really tell which observable it came from. But if you handle errors inside the inner observables, you can easily achieve that.
const observables = [];
const iteration = 5, idParent = 'eufe4839ffcdsj489483'; // example
for (let i = 0; i < iteration; i++) {
  observables.push(
    this.myHttpService.customGetMethod<Element>(COLLECTION_NAME, endpoint, 'idParent=' + idParent + '&limit=??') // url with query
      .pipe(catchError(() => { // catchError is imported from 'rxjs/operators'
        throw `an Error request #: ${i}`;
      }))
  );
}
forkJoin(observables).subscribe(
  data => {
    this.elements.push(data[0]);
    this.elements.push(data[1]);
  },
  err => console.error(err)
);
The other way could be to introduce infinite scrolling, e.g. with ngx-infinite-scroll, if you want to show the data as a list.
You can also add pagination on the frontend if that matches your requirements. One lib that might help you is Syncfusion grids. There are also ways to improve performance on the backend side, for example with a paginated endpoint as sketched below.
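For illustration, here is a minimal sketch of what such a paginated backend endpoint could look like, assuming an Express server with a Mongoose model; the Elements model name, the idParent field, and the route are assumptions based on the question, not code from it:
import express from 'express';
import { Elements } from './models'; // hypothetical Mongoose model for the Elements collection

const router = express.Router();

// GET /elements?idParent=...&page=0&limit=100
router.get('/elements', async (req, res) => {
  const idParent = String(req.query.idParent);
  const page = Number(req.query.page) || 0;
  const limit = Math.min(Number(req.query.limit) || 100, 500); // cap the page size

  try {
    // skip/limit pagination: each response carries only one page of elements
    const elements = await Elements.find({ idParent })
      .skip(page * limit)
      .limit(limit)
      .lean();
    res.json(elements);
  } catch (err) {
    res.status(500).json({ error: 'failed to fetch elements' });
  }
});
Each of the forkJoin-ed requests on the Angular side would then pass a different page value instead of a different limit.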
Related
I am trying to resolve an array of promises together. Not sure how to do it. Let me share the pseudo code for it.
async function sendNotification(user, notificationInfo) {
  const options = {
    method: 'POST',
    url: 'http://xx.xx.xx:3000/notification/send',
    headers: { 'Content-Type': 'application/json' },
    body: { notificationInfo, user },
    json: true,
  };
  console.log('sent');
  return rp(options);
}
I have wrapped the sendNotification method in another method which returns the promise of the rp (request-promise) module.
Next I am pushing this sendNotification call into an array of promises, something like this:
const notificationWorker = [];
for (const key3 in notificationObject) {
  if (notificationObject[key3].users.length > 0) {
    notificationWorker.push(sendNotification(notificationObject[key3].users, notificationObject[key3].payload)); // problem: notifications go out as soon as I push into the notificationWorker array
  }
}
// task 1 - send all notifications
const result = await Promise.all(notificationWorker); // resolving all notification promises together
// task 2 - update values in db , after sending all notifications
const result2 = await Promise.all(updateWorker); // update some values in db
In the above code, my problem is that notifications go out as soon as I push them into the notificationWorker array. I want all notifications to go together when I run await Promise.all(notificationWorker).
Not sure how to achieve what I am trying to do.
I understood the question partially, but I feel this is the difference between Node.js working concurrently and you trying to achieve parallelism, isn't that so?
Node.js just switches between tasks; it is not actually doing them in parallel. Child processes might help you in that case.
So, for example, if you go through this snippet:
function done(i) {
  try {
    return new Promise((resolve, reject) => {
      console.log(i);
      resolve("resolved " + i + "th promise");
    });
  } catch (e) {
    return null;
  }
}

let promises = [];
for (let i = 0; i < 100000; i++) {
  promises.push(done(i));
}
So the console starts logging even when you don't call Promise.all, right? That was your question. But in fact Promise.all will not suffice for your case; you should go with spawning child processes to achieve parallelism to some extent.
The point I am trying to make is that you are portraying the question as: first generate an array of promises, then start all of them at once when Promise.all is called. But in my opinion Promise.all will also be running them concurrently, not giving you what you want to achieve.
Something like this - https://htayyar.medium.com/multi-threading-in-javascript-with-paralleljs-10e1f7a1cf32 || How to create threads in nodejs
Though most of these cases show up when we need to do a CPU-intensive task, here we can achieve something like map-reduce: distribute your array of users into parts and loop over each part to send the notifications, as in the sketch below.
All of the solutions I am presenting aim at some kind of parallelism, but I don't think sending a huge array of users could ever be done easily at the same instant with limited resources (instance config etc.).
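To illustrate that map-reduce idea, here is a minimal sketch using Node's built-in worker_threads module; the chunk size, the demo user list, and the placeholder sendNotification are all assumptions for illustration, not code from the question:
// notify-workers.ts -- split users into chunks, run each chunk in its own worker thread
// (assumes the compiled .js file is what gets executed, since workers re-run this file)
import { Worker, isMainThread, workerData } from 'worker_threads';

const CHUNK_SIZE = 500; // assumption: tune to your workload

async function sendNotification(user: string): Promise<void> {
  // placeholder for the real HTTP call from the question (rp(options))
  console.log(`notifying ${user}`);
}

if (isMainThread) {
  const users = Array.from({ length: 2000 }, (_, i) => `user-${i}`); // demo data
  const chunks: string[][] = [];
  for (let i = 0; i < users.length; i += CHUNK_SIZE) {
    chunks.push(users.slice(i, i + CHUNK_SIZE));
  }
  // one worker per chunk; each worker receives its chunk via workerData
  const workers = chunks.map(chunk =>
    new Promise<void>((resolve, reject) => {
      const w = new Worker(__filename, { workerData: chunk });
      w.on('error', reject);
      w.on('exit', code => (code === 0 ? resolve() : reject(new Error(`worker exited with code ${code}`))));
    })
  );
  Promise.all(workers).then(() => console.log('all notifications dispatched'));
} else {
  // worker side: send the notifications for this chunk, then exit
  const chunk = workerData as string[];
  Promise.all(chunk.map(u => sendNotification(u))).then(() => process.exit(0));
}
Note that sending notifications is I/O-bound, so whether worker threads actually beat a single event loop here depends on where the bottleneck is; the sketch mainly shows the chunk-and-distribute structure.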
I know this is a pretty basic question, but I can't get anything working.
I have a list of URLs and I need to get the HTML from them using NodeJS.
I have tried using Axios, but the response returned is always undefined.
I am hitting the endpoint /process-logs with a post request and the body consists of logFiles (which is an array).
router.post("/process-logs", function (req, res, next) {
fileStrings = req.body.logFiles;
for (var i = 0; i < fileStrings.length; i++) {
axios(fileStrings[i]).then(function (response) {
console.log(response.body);
});
}
res.send("done");
});
A sample fileString is of the form https://amazon-artifacts.s3.ap-south-1.amazonaws.com/q-120/log1.txt.
How can I parallelize this process to do the same task for multiple files at a time?
I can think of two approaches here:
The first one is to use ES6 promises (Promise.all) and async/await, chunking the fileStrings array into n chunks. This is a basic approach and you have to handle a lot of cases yourself.
This is the general idea of the flow I am thinking of:
async function handleChunk(chunk) {
  const toBeFulfilled = [];
  for (const file of chunk) {
    toBeFulfilled.push(axios.get(file)); // replace axios.get with logic per file
  }
  return Promise.all(toBeFulfilled);
}

async function main(fileStrings) {
  const limit = 10; // chunk size
  try {
    for (let i = 0; i < fileStrings.length; i += limit) {
      const chunk = fileStrings.slice(i, i + limit);
      const results = await handleChunk(chunk);
      console.log(results);
    }
  } catch (e) {
    console.log(e);
  }
}

// call from inside the route handler, where req is in scope
main(req.body.logFiles).then(() => { console.log('done'); }).catch((e) => { console.log(e); });
One of the drawbacks is that we are processing chunks sequentially (chunk by chunk, still better than file by file). One enhancement could be to chunk the fileStrings ahead of time and process the chunks concurrently, as in the sketch below (it really depends on what you're trying to achieve and what limitations you have).
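A minimal sketch of that enhancement, reusing the handleChunk helper from above (the chunk size of 10 is an arbitrary assumption, and the snippet is meant to live inside an async function like main):
// chunk ahead of time, then run every chunk concurrently
const limit = 10; // assumed chunk size
const chunks = [];
for (let i = 0; i < fileStrings.length; i += limit) {
  chunks.push(fileStrings.slice(i, i + limit));
}
// each handleChunk call fires immediately; Promise.all waits for all of them
// note: with all chunks in flight at once, total concurrency equals the full list again
const allResults = await Promise.all(chunks.map(chunk => handleChunk(chunk)));
console.log(allResults.flat());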
The second approach is to use the Async library, which has many control flows and collection helpers that let you configure the concurrency, etc. (I really recommend this approach).
You should have a look at Async's queue control flow to run the same task for multiple files concurrently; a sketch with one of its collection helpers follows.
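For example, something along these lines with async.mapLimit (a sketch; the concurrency of 5 is an assumption, and the snippet is meant to live inside an async route handler):
import * as async from 'async';
import axios from 'axios';

// inside e.g. router.post("/process-logs", async (req, res) => { ... })
const fileStrings: string[] = req.body.logFiles;

// fetch every log file, with at most 5 requests in flight at any time
const bodies = await async.mapLimit(fileStrings, 5, async (url: string) => {
  const response = await axios.get(url);
  return response.data; // the text of the log file
});

res.send(`done, fetched ${bodies.length} files`);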
So I have an Angular component.
With some array objects containing data I want to work with:
books: Book[] = [];
reviews: Review[] = [];
This is what my ngOnInit() looks like:
ngOnInit(): void {
  this.retrieveBooks();
  this.retrieveReviews();
  this.setRatingToTen();
}
With this I write Books and Reviews to object arrays.
This is done through a "subscription" to data through services:
retrieveBooks(): void {
  this.bookService.getAll()
    .subscribe(
      data => {
        this.books = data;
      },
      error => {
        console.log(error);
      }
    );
}

retrieveReviews(): void {
  this.reviewService.getAll()
    .subscribe(
      data => {
        this.reviews = data;
      },
      error => {
        console.log(error);
      }
    );
}
So this next function I have is just an example of "working with the data".
In this example, I just want to change all of the totalratings to 10 for each Book:
setRatingToTen(): void {
  this.books.forEach(element => {
    element.totalrating = 10;
  });
  console.log(this.books);
}
The problem I have been trying to wrap my head around is this:
this.books is an empty array.
I THINK the reason is that this function runs before the data subscriptions deliver anything.
IF this is the case, then my understanding of ngOnInit must not be right.
I thought it would call the functions in order.
Maybe that's still the case; it's just that they don't complete in order.
So my questions are:
1. Why is it an empty array?
(was I right? or is there more to it?)
2. How do Angular developers write functions so they operate in a desired order?
(since the data needs to be there so I can work with it, how do I avoid this issue?)
(3.) BONUS question:
(if you have the time, please and thank you)
My goal is to pull this.reviews.rating for each book where this.reviews.title equals this.books.title, compute an average score, and then overwrite the "0" placeholder of this.books.totalrating with that average. How could I rewrite the setRatingToTen() function to accomplish this?
Here is one solution, using the forkJoin method from RxJS.
You can check this for details: https://medium.com/@swarnakishore/performing-multiple-http-requests-in-angular-4-5-with-forkjoin-74f3ac166d61
ngOnInit:
ngOnInit() {
  this.requestDataFromMultipleSources().subscribe(resList => {
    this.books = resList[0];
    this.reviews = resList[1];
    this.setRatingToTen(this.books, this.reviews);
  });
}
forkJoin method:
public requestDataFromMultipleSources(): Observable<any[]> {
  let response1 = this.retrieveBooks();
  let response2 = this.retrieveReviews();
  // Observable.forkJoin (RxJS 5) changes to just forkJoin() in RxJS 6
  return forkJoin([response1, response2]);
}
Other methods (these must return the observables rather than subscribing to them, so that forkJoin receives two observables instead of void):
retrieveBooks(): Observable<Book[]> {
  return this.bookService.getAll();
}
retrieveReviews(): Observable<Review[]> {
  return this.reviewService.getAll();
}
setRatingToTen(books, reviews): void {
  books.forEach(element => {
    element.totalrating = 10;
  });
  console.log(books);
}
Angular makes heavy use of observables to handle a variety of asynchronous operations. Making server-side requests (through HTTP) is one of those.
Your first two questions clearly reflect that you are overlooking the asynchronous nature of observables.
Observables are lazy Push collections of multiple values. detailed link
This means an observable's response is pushed over time in an asynchronous way. You cannot guarantee which of the two distinct functions will receive its response first.
Having said that, the rxjs library (observables are part of this library; Angular borrowed them from there) provides a rich collection of operators that you can use to manipulate observables.
With the above explanation, here are one-by-one answers to your questions.
1. Why is it an empty array?
Because you are thinking in terms of a synchronous, sequential flow of code, where one method is called only after the other has finished its work. But here retrieveBooks and retrieveReviews both make asynchronous (observable) calls and subscribe to them, so there is no guarantee when their responses will be received. Meanwhile the call to setRatingToTen has already been made, and at that point in time the books array was still empty.
2. How do Angular developers write functions so they operate in a desired order?
An Angular developer would understand the nature of asynchronous observable calls and would pipe the operators in an order that guarantees the response is in hand before performing any further operation on the observable stream.
(3.) BONUS question:
Your requirement specifies that you must first have the responses of both observables at hand before performing any action. For that, the forkJoin rxjs operator suits your need. The documentation for this operator says:
One common use case for this is if you wish to issue multiple requests on page load (or some other event) and only want to take action when a response has been received for all. detailed link
Not sure about your average-score strategy, but here is example code showing how you could achieve your purpose.
ngOnInit() {
  let req1$ = this.bookService.getAll();
  let req2$ = this.reviewService.getAll();
  forkJoin([req1$, req2$]).subscribe(([response1, response2]) => {
    for (let book of response1) { // loop through book array first
      for (let review of response2) { // inner loop of reviews
        if (book.title == review.title) {
          // complete your logic here..
        }
      }
    }
  });
}
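For the bonus question, here is a sketch of what that inner logic could look like when computing the averages (assuming, as described in the question, that Review has title and rating fields and Book has title and a totalrating placeholder of 0):
forkJoin([this.bookService.getAll(), this.reviewService.getAll()])
  .subscribe(([books, reviews]) => {
    for (const book of books) {
      // collect the ratings of all reviews whose title matches this book
      const ratings = reviews
        .filter(review => review.title === book.title)
        .map(review => review.rating);
      if (ratings.length > 0) {
        // overwrite the 0 placeholder with the average rating
        book.totalrating = ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
      }
    }
    this.books = books;
    this.reviews = reviews;
  });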
I can't seem to get the article duplicates out of my web scraper results; this is my code:
app.get("/scrape", function (req, res) {
request("https://www.nytimes.com/", function (error, response, html) {
// Load the HTML into cheerio and save it to a variable
// '$' becomes a shorthand for cheerio's selector commands, much like jQuery's '$'
var $ = cheerio.load(html);
var uniqueResults = [];
// With cheerio, find each p-tag with the "title" class
// (i: iterator. element: the current element)
$("div.collection").each(function (i, element) {
// An empty array to save the data that we'll scrape
var results = [];
// store scraped data in appropriate variables
results.link = $(element).find("a").attr("href");
results.title = $(element).find("a").text();
results.summary = $(element).find("p.summary").text().trim();
// Log the results once you've looped through each of the elements found with cheerio
db.Article.create(results)
.then(function (dbArticle) {
res.json(dbArticle);
}).catch(function (err) {
return res.json(err);
});
});
res.send("You scraped the data successfully.");
});
});
// Route for getting all Articles from the db
app.get("/articles", function (req, res) {
  // Grab every document in the Articles collection
  db.Article.find()
    .then(function (dbArticle) {
      res.json(dbArticle);
    })
    .catch(function (err) {
      res.json(err);
    });
});
Right now I am getting five copies of each article sent to the user. I have tried db.Article.distinct and various versions of this to filter the results down to only unique articles. Any tips?
In Short:
Switching var results = [] from an Array to an Object, var results = {}, did the trick for me. I still haven't figured out the exact reason for the duplicate insertion of documents in the database; I will update as soon as I find out.
Long Story:
You have multiple mistakes and points of improvement in your code. I will try to point them out.
Let's go through the mistakes first to make your code error-free.
Mistakes
1. Although mongoose's Model.create / new Model() does seem to work fine with arrays, I haven't seen such a use before and it does not even look appropriate.
If you intend to create documents one after another, represent each document as an object instead of an array. Using an array is the more mainstream choice when you intend to create multiple documents at once.
So switch -
var results = [];
to
var results = {};
2. Sending response headers after they have already been sent will raise an error. I don't know if you have already noticed it, but it is pretty clear upfront: once that error pops up, the remaining documents won't get stored, because of a PromiseRejection error if you haven't set up a try/catch block.
The database calls issued inside $("div.collection").each(function (i, element)) complete asynchronously, so your process control won't wait for each document to be processed; instead it immediately executes res.send("You scraped the data successfully.");.
This effectively terminates the HTTP connection between the client and the server, and any further response-termination calls like res.json(dbArticle) or res.json(err) will throw an error.
So, just comment out the res.json statements inside the .create's then and catch methods. This terminates the response even before all the articles are saved in the DB, but you need not worry, as your code will still work behind the scenes, saving articles in the database for you (asynchronously).
If you want your response to be terminated only after you have successfully saved the data, then change your middleware implementation to this:
request('https://www.nytimes.com', (err, response, html) => {
  var $ = cheerio.load(html);
  var results = [];
  $("div.collection").each(function (i, element) {
    var ob = {};
    ob.link = $(element).find("a").attr("href");
    ob.title = $(element).find("a").text();
    ob.summary = $(element).find("p.summary").text().trim();
    results.push(ob);
  });
  db.Article.create(results)
    .then(function (dbArticles) {
      res.json(dbArticles);
    }).catch(function (err) {
      return res.json(err);
    });
});
After making the above changes (and even after just the first one), my version of your code ran fine. So if you want, you can continue with your current version, or you may read these points of improvement.
Points of Improvement
1. The era of callbacks is long gone:
Convert your implementation to utilise Promises, as they are more maintainable and easier to reason about. Here are the things you can do:
Change the request library from request to axios, or any one which supports Promises by default.
2. Make effective use of mongoose methods for insertion. You can perform bulk inserts of multiple documents in just one query, as sketched below. You may find the docs on creating documents in MongoDB quite helpful.
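A brief sketch of such a bulk insert with Mongoose's insertMany; the de-duplication step is just one possible approach and assumes the link field uniquely identifies an article:
// de-duplicate first (assumption: `link` is unique per article)
const seen = new Set();
const unique = results.filter(r => !seen.has(r.link) && seen.add(r.link));

// then insert everything in one round trip instead of one create() per article
db.Article.insertMany(unique, { ordered: false })
  .then(dbArticles => res.json(dbArticles))
  .catch(err => res.json(err));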
3. Start using a headless-browser automation library such as Puppeteer or Nightmare.js for scraping-related tasks. Trust me, they make life a whole lot easier than using cheerio or any other library for the same purpose. Their docs are really good and well maintained, so you won't have a hard time picking them up.
I need to perform lots of findOneAndUpdate() operations using mongoose, as there is no way to perform this atomic operation in bulk. Therefore I create a promise array in a for loop which is resolved afterwards. Unfortunately this takes ~2-3 seconds, and during that time my Express application can't process any new requests.
The code:
const promiseArray = []
for (let i = 0; i < 1500; i++) {
  const p = PlayerProfile.findOneAndUpdate(filter, updateDoc)
  promiseArray.push(p)
}
return Promise.all(promiseArray).then((values) => {
  // Process the values
})
Question:
How can I avoid that my Express application becomes completely unresponsive to new requests while it's resolving this promise?
More context information:
I am trying to update and return many documents with an atomic operation, hence the big for loop. It's basically selecting a document and setting up a lock for it.
Try using update with the multi option:
PlayerProfile.update(filter, updateDoc, { multi: true }, function (err, result) {
  // Do something
})
The signature is:
Model.update(conditions, update, options, callback)
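Note that update with { multi: true } is deprecated in recent Mongoose versions; the equivalent there is updateMany, which also returns a promise. A sketch:
// a single round trip updates every matching document on the server side,
// so the Node event loop is not tied up issuing 1500 individual queries
const result = await PlayerProfile.updateMany(filter, updateDoc);
console.log(`${result.modifiedCount} documents updated`); // modifiedCount is the Mongoose 6+ field name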