I have an array of ids, and I want to make an API request for each id, but I want to control how many requests are made per second, or better still, have only 5 open connections at any time, and when a connection completes, fetch the next one.
Currently I have this, which just fires off all the requests at the same time:
_.each([1,2,3,4,5,6,7,8,9,10], function(issueId) {
github.fetchIssue(repo.namespace, repo.id, issueId, filters)
.then(function(response) {
console.log('Writing: ' + issueId);
writeIssueToDisk(fetchIssueCallback(response));
});
});
Personally, I'd use Bluebird's .map() with the concurrency option since I'm already using promises and Bluebird for anything async. But, if you want to see what a hand-coded counter scheme that restricts how many concurrent requests can run at once looks like, here's one:
function limitEach(collection, max, fn, done) {
var cntr = 0, index = 0, errFlag = false;
function runMore() {
while (!errFlag && cntr < max && index < collection.length) {
++cntr;
fn(collection[index++], function(err, data) {
--cntr;
if (errFlag) return;
if (err) {
errFlag = true;
done(err);
} else {
runMore();
}
});
}
if (!errFlag && cntr === 0 && index === collection.length) {
done();
}
}
runMore();
}
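For comparison, the same counter idea can be written against promises without any library. This is only a sketch: mapLimit and fakeFetch are made-up names, and fakeFetch stands in for a real request such as github.fetchIssue().

```javascript
// Sketch: run fn over items with at most `limit` calls in flight at once.
// mapLimit is a hypothetical helper name, not part of any library.
function mapLimit(items, limit, fn) {
  return new Promise(function (resolve, reject) {
    var results = new Array(items.length);
    var next = 0;       // index of the next item to start
    var active = 0;     // number of calls currently in flight
    var failed = false;

    function start(i) {
      active++;
      // Wrapping in new Promise also catches a synchronous throw from fn
      new Promise(function (res) { res(fn(items[i])); }).then(function (value) {
        results[i] = value; // keep results in input order
        active--;
        launch();
      }, function (err) {
        failed = true;      // stop launching new work after the first error
        reject(err);
      });
    }

    function launch() {
      if (failed) return;
      if (next === items.length && active === 0) {
        return resolve(results);
      }
      while (active < limit && next < items.length) {
        start(next++);
      }
    }

    launch();
  });
}

// Stand-in for a real request; resolves after a short delay.
function fakeFetch(id) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('issue ' + id); }, 10);
  });
}

var ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
mapLimit(ids, 5, fakeFetch).then(function (results) {
  console.log(results.length + ' issues fetched');
});
```

Each time one call settles, the next one starts, so there are never more than `limit` open at once.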
With Bluebird:
function fetch(id) {
console.log("Fetching " + id);
return Promise.delay(2000, id)
.then(function(id) {
console.log(" Fetched " + id);
});
}
var ids = [1,2,3,4,5,6,7,8,9,10];
Promise.map(ids, fetch, { concurrency: 3 });
<script src="https://cdnjs.cloudflare.com/ajax/libs/bluebird/3.3.1/bluebird.min.js"></script>
Divide your data into as many arrays as you want concurrent connections. Schedule with setTimeout, and have the completion callback handle the rest of the sub-array.
Wrap the setTimeout in a function of its own so that the variable values are frozen to their values at the time of delayed_fetch() invocation.
function delayed_fetch(delay, namespace, id, issueIds, filters) {
setTimeout(
function() {
if (issueIds.length === 0) return; // sub-array exhausted, stop this chain
var issueId=issueIds.shift();
github.fetchIssue(namespace, id, issueId, filters).then(function(response) {
console.log('Writing: ' + issueId);
writeIssueToDisk(fetchIssueCallback(response));
delayed_fetch(0, namespace, id, issueIds, filters);
});
}, delay);
}
var i=0;
_.each([ [1,2] , [3,4], [5,6], [7,8], [9,10] ], function(issueIds) {
var delay=++i*200; // millisecond
delayed_fetch(delay, repo.namespace, repo.id, issueIds, filters);
});
I'd recommend using throat for exactly this: https://github.com/ForbesLindesay/throat
Using Bluebird
function getUsersFunc(user) {
// get a collection of users
}
function getImagesFunc(id) {
// get a collection of profile images based on the id of the user
}
function search(response) {
return getUsersFunc(response).then(response => {
const promises = response.map(items => items.id);
const images = id => {
return getImagesFunc(id).then(items => items.image);
};
return Promise.map(promises, images, { concurrency: 5 });
});
}
Previously I used the ES6 Promise.all(), but it didn't work the way I expected. I then switched to the third-party library Bluebird, and it works like a charm.
Related
I am trying to check that all 4 of my images have been uploaded to the server without any error and then redirect to another page, so I am trying to perform a synchronous check in my code (I have 4 images in total in my imgResultAfterCompress array). Below is my code:
if(Boolean(this.updateImage(data.AddId))===true)
{
this.router.navigate(['/job-in-hotels-india-abroad']);
}
updateImage(AddId:number):Observable<boolean>
{
this.cnt=0;
this.uploadingMsg='Uploading Images...';
this.imgResultAfterCompress.forEach( (value, key) => {
if(value!=='')
{
this.itemService.updateImage(this.employer.ID,AddId,key,value).subscribe(data=>{
if(data && data.status == 'success') {
this.uploadingMsg=this.uploadingMsg+'<br>Image No - '+(key+1)+' Uploaded.';
this.cnt++;
}
else
this.alertService.error(data.message);
});
}
if(this.cnt==4)
this.uploadingDone= true;
else
this.uploadingDone= false
});
return this.uploadingDone;
}
Every time, the value of cnt is 0. I want the redirect to happen only when it reaches 4 (all images completely uploaded).
The easiest way is to combine your observables into a single one using the zip operator:
https://rxjs-dev.firebaseapp.com/api/index/function/zip
Once every request has finished successfully, the zipped Observable emits once with all the results.
UPDATE:
This is how I think it should look. I may have missed something specific, but the general idea should be clear:
redirect() {
this.updateImages(data.AddId).subscribe(
() => this.router.navigate(['/job-in-hotels-india-abroad']),
error => this.alertService.error(error.message)
)
}
updateImages(AddId: number): Observable<boolean[]> {
this.uploadingMsg = 'Uploading Images...';
const requests: Observable<boolean>[] = [];
this.imgResultAfterCompress.forEach((value, key) => {
if (!value) {
return;
}
requests.push(
this.itemService.updateImage(this.employer.ID, AddId, key, value)
.pipe(
tap(() => this.uploadingMsg = this.uploadingMsg + '<br>Image No - ' + (key + 1) + ' Uploaded.'),
switchMap((data) => {
if (data && data.status == 'success') {
return of(true)
} else {
return throwError(new Error('Failed to upload image'));
}
})
)
)
});
return zip(...requests);
}
I finally got the desired result by using forkJoin.
Service.ts:
public requestDataFromMultipleSources(EmpId: number,AddId:number,myFiles:any): Observable<any[]> {
let response: any[] = [];
myFiles.forEach(( value, key ) => {
response.push(this.http.post<any>(this.baseUrl + 'furniture.php', {EmpId: EmpId, AddId:AddId,ImgIndex:key,option: 'updateAdImg', myFile:value}));
});
// Observable.forkJoin (RxJS 5) changes to just forkJoin() in RxJS 6
return forkJoin(response);
}
my.component.ts
let resCnt=0;
this.itemService.requestDataFromMultipleSources(this.employer.ID,AddId,this.imgResultAfterCompress).subscribe(responseList => {
responseList.forEach( value => {
if(value.status=='success')
{
resCnt++;
this.uploadingMsg=this.uploadingMsg+'<br>Image No - '+(value.ImgIndex+1)+' Uploaded.';
}
else
this.uploadingMsg=this.uploadingMsg+'<br>Problem In Uploading Image No - '+(value.ImgIndex+1)+', Please choose another one.';
});
if(resCnt === this.imgResultAfterCompress.length)
{
this.alertService.success('Add Posted Successfully');
this.router.navigate(['/job-in-hotels-india-abroad']);
}
else
this.alertService.error('Problem In Uploading Your Images');
});
You shouldn't try to make synchronous calls within a loop. It is possible with async/await, but awaiting each call in turn runs the uploads one after another, which hurts performance and is a common anti-pattern.
Look into Promise.all(). You could wrap each call in a promise and redirect once all the promises have resolved.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/all
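A rough sketch of that idea follows. It assumes each upload can be expressed as a promise; uploadImage and uploadAll are made-up names, and uploadImage only simulates what itemService.updateImage() would return.

```javascript
// uploadImage is a hypothetical stand-in for the real upload call;
// it resolves with a response shaped like the one in the question.
function uploadImage(index, value) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ status: 'success', ImgIndex: index });
    }, 5);
  });
}

function uploadAll(images) {
  var uploads = [];
  images.forEach(function (value, index) {
    if (value !== '') {            // skip empty slots, as in the question
      uploads.push(uploadImage(index, value));
    }
  });
  // All uploads start right away; Promise.all waits for every one of them
  return Promise.all(uploads).then(function (responses) {
    return responses.every(function (r) {
      return r && r.status === 'success';
    });
  });
}

uploadAll(['a', 'b', '', 'd']).then(function (allOk) {
  if (allOk) {
    // this is where the redirect (router.navigate) would go
    console.log('All images uploaded, safe to redirect');
  }
});
```

The point is that the check happens inside the then() callback, after every upload has reported back, rather than right after the loop.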
I am trying to load data from the Twitter API, get user information, and save it in a temporary array. That array will then be loaded on the page for viewing. The array is getting filled by the API call, but it doesn't display.
I think I need to use an asynchronous thing like React or Angular, not sure. Would love some input!
function getUserIds (userId) {
T.get('statuses/retweeters/ids', { id: userId }, function (err, data, response) {
for(var i = 0; i < data.ids.length; i++){
ids.push(data.ids[i]);
}
getUserInfo();
});
}
function getUserInfo() {
for(var i = 0; i < ids.length; i++) {
T.get('users/lookup', { user_id: ids[i] }, function (err, data, response) {
names.push(data[0].screen_name);
pics.push(data[0].profile_image_url_https);
console.log(names);
});
}
res.render('display', {names: names, pics:pics});
}
The issue is that you are running ids.length async calls and those will finish some time in the future. You have to render your page only when they are all done. But, your for loop is synchronous so you are calling res.render() before any of them have finished. In addition, your T.get() calls may finish in any order (if that matters).
I would normally use promises for coordinating multiple asynchronous operations since it is a very, very good tool for that. But, if you aren't using promises, here's a simple technique to test when you have all your results back:
function getUserInfo() {
var names = [];
var pics = [];
for(var i = 0; i < ids.length; i++) {
T.get('users/lookup', { user_id: ids[i] }, function (err, data, response) {
if (err) {
// decide what to display if you get an API error
names.push("unknown due to API error");
} else {
names.push(data[0].screen_name);
pics.push(data[0].profile_image_url_https);
console.log(names);
}
if (names.length === ids.length) {
res.render('display', {names: names, pics:pics});
}
});
}
}
As I said above, this does not necessarily collect the results in order. If you need them in order, then you could do something like this:
function getUserInfo() {
var names = new Array(ids.length);
var pics = new Array(ids.length);
var doneCntr = 0;
ids.forEach(function(id, i) {
T.get('users/lookup', { user_id: id }, function (err, data, response) {
if (err) {
// decide what to display if you get an API error
names[i] = "unknown due to API error";
} else {
names[i] = data[0].screen_name;
pics[i] = data[0].profile_image_url_https;
}
++doneCntr;
if (doneCntr === ids.length) {
res.render('display', {names: names, pics: pics});
}
});
});
}
My preferred solution would be to use Promise.all() with a promisified version of T.get().
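Here is roughly what that could look like. Treat it as a sketch: the T object below is a fake so the code runs on its own, and getAsync is a made-up helper name; only the T.get(path, params, callback) shape comes from the question.

```javascript
// Stand-in for the Twitter client so the sketch is runnable by itself.
var T = {
  get: function (path, params, callback) {
    setTimeout(function () {
      callback(null, [{
        screen_name: 'user' + params.user_id,
        profile_image_url_https: 'https://example.com/' + params.user_id + '.png'
      }]);
    }, 5);
  }
};

// Wrap the callback API in a promise (hypothetical helper)
function getAsync(path, params) {
  return new Promise(function (resolve, reject) {
    T.get(path, params, function (err, data) {
      if (err) reject(err); else resolve(data);
    });
  });
}

function getUserInfo(ids) {
  return Promise.all(ids.map(function (id) {
    return getAsync('users/lookup', { user_id: id });
  })).then(function (results) {
    // Promise.all keeps results in the same order as ids
    return {
      names: results.map(function (data) { return data[0].screen_name; }),
      pics: results.map(function (data) { return data[0].profile_image_url_https; })
    };
  });
}

getUserInfo([11, 22, 33]).then(function (view) {
  // in the original code this is where res.render('display', view) would go
  console.log(view.names);
});
```

With this version a single failed lookup rejects the whole Promise.all(), so you would add a catch() or handle errors per request, depending on what you want to display.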
I need to remove all documents from my MongoDB collection that don't exist in a new array of objects.
So I have an array of objects like:
var items = [
{product_id:15, pr_name: 'a', description : 'desc'},
{product_id:44, pr_name: 'b', description : 'desc2'},
{product_id:32, pr_name: 'c', description : 'desc3'}];
and I have array with db values which I get by calling Model.find({}).
So now I do it in a 'straight' way:
async.each(products, function (dbProduct, callback) { //cycle for products removing
var equals = false;
async.each(items, function(product, callback){
if (dbProduct.product_id === product.product_id){
product.description = dbProduct.description;// I need to save desc from db product to new product
equals = true;
}
callback();
});
if (!equals) {
log.warn("REMOVE PRODUCT " + dbProduct.product_id);
Product.remove({ _id: dbProduct._id }, function (err) {
if (err) return updateDBCallback(err);
callback();
});
}
});
But it blocks the whole app and it is very slow, because I have around 5000 values in my items array and in the database too, so the number of loop iterations is huge.
Maybe there can be a faster way?
UPDATE1
Using the code below, from TbWill4321's answer:
var removeIds = [];
// cycle for products removing
async.each(products, function (dbProduct, callback) {
for ( var i = 0; i < items.length; i++ ) {
var product = items[i];
if (dbProduct.product_id === product.product_id) {
// I need to save desc from db product to new product
product.description = dbProduct.description;
// Return early for performance
return callback();
}
}
// Mark product to remove.
removeIds.push( dbProduct._id );
log.warn("REMOVE PRODUCT " + dbProduct.product_id);
return callback();
}, function() {
Product.remove({ _id: { $in: removeIds } }, function (err) {
if (err) return updateDBCallback(err);
// Continue Here.
// TODO
});
});
It takes around 11 seconds (blocking the whole web app) and 12,362,878 iterations for me.
So maybe somebody can advise me something?
The Async library does not execute synchronous code in an asynchronous fashion.
5000 items is not a huge number for JavaScript; I've worked on big data sets with 5 million+ points and it doesn't take long. You can get better performance by structuring the code like this:
var removeIds = [];
// cycle for products removing
async.each(products, function (dbProduct, callback) {
for ( var i = 0; i < items.length; i++ ) {
var product = items[i];
if (dbProduct.product_id === product.product_id) {
// I need to save desc from db product to new product
product.description = dbProduct.description;
// Return early for performance
return callback();
}
}
// Mark product to remove.
removeIds.push( dbProduct._id );
log.warn("REMOVE PRODUCT " + dbProduct.product_id);
return callback();
}, function() {
Product.remove({ _id: { $in: removeIds } }, function (err) {
if (err) return updateDBCallback(err);
// Continue Here.
// TODO
});
});
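If the nested loop is still too slow, one further option is to index the new items by product_id first, so each database product is matched with a single lookup instead of a scan. This is a sketch with made-up sample data; diffProducts is a hypothetical helper name.

```javascript
// Sample data shaped like the question's; the values are invented.
var items = [
  { product_id: 15, pr_name: 'a', description: 'desc' },
  { product_id: 44, pr_name: 'b', description: 'desc2' }
];
var products = [
  { _id: 'x1', product_id: 15, description: 'from db' },
  { _id: 'x2', product_id: 99, description: 'stale' }
];

function diffProducts(products, items) {
  // Build the lookup once: product_id -> new item
  var byId = new Map();
  items.forEach(function (item) { byId.set(item.product_id, item); });

  var removeIds = [];
  products.forEach(function (dbProduct) {
    var item = byId.get(dbProduct.product_id);
    if (item) {
      item.description = dbProduct.description; // keep the description from the db
    } else {
      removeIds.push(dbProduct._id);            // not in the new list: mark for removal
    }
  });
  return removeIds;
}

var removeIds = diffProducts(products, items);
console.log(removeIds);
```

The resulting removeIds array can then be passed to a single Product.remove({ _id: { $in: removeIds } }, ...) call as in the code above. With about 5000 entries on each side this is roughly 10,000 steps rather than millions of comparisons.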
Among the many problems you may have, off the top of my head you may want to start off by changing this bit:
Product.remove({ _id: dbProduct._id }, function (err) {
if (err) return updateDBCallback(err);
callback();
});
Because this sits inside an .each() call, you make one call to the database for each element you want to delete. It's better to store all the ids in one array and then make a single query that deletes every element whose _id is in that array. Like this:
Product.remove({ _id: {$in: myArrayWithIds} }, function (err) {
if (err) return updateDBCallback(err);
callback();
});
On another note, since the async library runs this code synchronously, Node.js offers setImmediate(), which schedules a function to run on a later turn of the event loop. So basically you can "pause" between elements and let Node serve any incoming requests, which simulates "non-blocking" processing.
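A minimal sketch of that setImmediate() idea, assuming the work can be split into slices; processInChunks is a made-up helper name and the chunk size is arbitrary.

```javascript
// Process `list` in slices of `chunkSize`, yielding to the event loop
// between slices. processInChunks is a hypothetical helper, not a library call.
function processInChunks(list, chunkSize, handleItem, done) {
  var index = 0;
  function runChunk() {
    var end = Math.min(index + chunkSize, list.length);
    for (; index < end; index++) {
      handleItem(list[index], index);
    }
    if (index < list.length) {
      setImmediate(runChunk); // let pending I/O callbacks run before the next slice
    } else {
      done();
    }
  }
  runChunk();
}

var total = 0;
processInChunks([1, 2, 3, 4, 5], 2, function (n) { total += n; }, function () {
  console.log('sum:', total);
});
```

The total amount of work is the same; it is just spread over several turns of the event loop so other requests are not starved while it runs.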
I have this event which is fired once every 2 seconds by external processes (it's a serial port receiving data) :
sp.on("data", function (rawData) {
try {
data = JSON.parse(rawData);
var collection = db.get('sensorsCollection');
collection.insert({
...
});
} catch (error) {
debug(error);
}
});
But I want to store data in the database only once every, let's say, 500 seconds to avoid overloading my database. How can I achieve that?
(Note: I tried to use underscore.js's throttle function but couldn't find how to pass an argument to the throttled function, so I couldn't pass my fresh data variable containing the most recent data.)
Totally untested, but would something like this do what you want?:
(function() {
var collection = db.get('sensorsCollection');
var data = [];
sp.on("data", function (rawData) {
try {
data.push(JSON.parse(rawData));
} catch (error) {
debug(error);
}
});
setInterval(function() { // try-catch here too if necessary
if (data.length === 0) return; // nothing received since the last write
collection.insert(data); // additional formatting?
data = [];
}, 500 * 1000);
}());
Edited to use setInterval rather than throttle, which didn't make sense the way it was being used.
Store your data, and send it every 500 seconds:
var my_datas = [];
var collection = db.get('sensorsCollection');
sp.on("data", function (rawData) {
  // store the raw data until the next flush
  my_datas.push(rawData);
});
setInterval(function () {
  // take the buffered items and reset the buffer so nothing is inserted twice
  var batch = my_datas;
  my_datas = [];
  for (var i = 0, len = batch.length; i < len; i++) {
    try {
      var data = JSON.parse(batch[i]);
      collection.insert({
        ...
      });
    } catch (error) {
      debug(error);
    }
  }
}, 500 * 1000);
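If you only want the reading that arrives once the interval has elapsed, rather than a batch of everything received, a small hand-rolled sampler is another option. This is a sketch: makeSampler and its `now` parameter are made-up names, and the fake clock is only there to make the behavior easy to follow.

```javascript
// Returns a handler that forwards a reading to `write` at most once per
// `intervalMs`; readings that arrive in between are dropped.
// makeSampler and the `now` parameter are hypothetical, for illustration.
function makeSampler(intervalMs, write, now) {
  now = now || Date.now;
  var lastWrite = -Infinity;
  return function (reading) {
    var t = now();
    if (t - lastWrite >= intervalMs) {
      lastWrite = t;
      write(reading);
    }
  };
}

// Usage with a fake clock (milliseconds)
var clock = 0;
var written = [];
var onReading = makeSampler(500 * 1000,
  function (r) { written.push(r); },
  function () { return clock; });

onReading({ temp: 20 });   // written: first reading
clock += 2000;
onReading({ temp: 21 });   // dropped: only 2 seconds have passed
clock += 500 * 1000;
onReading({ temp: 22 });   // written: the interval has elapsed
console.log(written.length);
```

In the real handler you would call onReading(JSON.parse(rawData)) inside sp.on("data", ...) and do the collection.insert() in the write callback, leaving the clock argument out so Date.now is used.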
I have an asynchronous function inside a for loop nested in another for loop.
// recipesArray is an array of arrays of objects
// recipeObject is an array of objects
// currentRecipe is an object
connectToDb(function(){
// LOOP 1
for (var i=0, l=recipesArray.length; i < l; i++) {
// recipeObject is an array of objects
var recipeObject = recipesArray[i];
// LOOP 2
for (var x=0, y=recipeObject.length; x < y; x++) {
var currentRecipe = recipeObject[x];
// this is an asynchronous function
checkRecipe(currentRecipe, function (theRecipe) {
if (theRecipe === undefined) {
console.log('RECIPE NOT FOUND');
} else {
console.log('RECIPE FOUND', theRecipe);
}
});
}
}
});
I need to add data to the recipesArray based on the results of the checkRecipe function.
I've been trying different things...
- do I try to keep track of i and x...
- do I try to have multiple callbacks...
- do I even need to do all of that, or is there some other way...
I also tried using the async library for Node (which has actually been very helpful in other situations), but its forEach doesn't take objects (only an array).
Stuck.
Any suggestions would be greatly appreciated.
Assuming checkRecipe() can be run in parallel with no limits, here's how you might use async.each():
connectToDb(function() {
async.each(recipesArray, function(subArray, callback) {
async.each(subArray, function(currentRecipe, callback2) {
checkRecipe(currentRecipe, function(theRecipe) {
if (theRecipe === undefined)
return callback2(new Error('Recipe not found'));
callback2();
});
}, callback);
}, function(err) {
if (err)
return console.error('Error: ' + err);
// success, all recipes found
});
});
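If you would rather record the outcome on each recipe than stop at the first one that is missing, a plain counter works too. The following is a sketch: checkRecipe here is a fake so the code runs on its own, and checkAll and the `found` property are made-up names.

```javascript
// Fake async lookup; the real checkRecipe comes from the question.
// In this stand-in, recipes with an even id are "found".
function checkRecipe(recipe, callback) {
  setTimeout(function () {
    var hit = recipe.id % 2 === 0;
    callback(hit ? { id: recipe.id, title: 'Recipe ' + recipe.id } : undefined);
  }, 5);
}

function checkAll(recipesArray, done) {
  // Count every recipe up front so we know when the last callback arrives
  var pending = 0;
  recipesArray.forEach(function (subArray) {
    pending += subArray.length;
  });
  if (pending === 0) return done();

  recipesArray.forEach(function (subArray) {
    subArray.forEach(function (currentRecipe) {
      // forEach gives each callback its own currentRecipe,
      // so there is no need to keep track of i and x
      checkRecipe(currentRecipe, function (theRecipe) {
        currentRecipe.found = theRecipe !== undefined;
        if (--pending === 0) done();
      });
    });
  });
}

var recipesArray = [[{ id: 1 }, { id: 2 }], [{ id: 4 }]];
checkAll(recipesArray, function () {
  console.log(JSON.stringify(recipesArray));
});
```

Because the result is written onto the object captured by the closure, the order in which the callbacks come back does not matter.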