How to handle async code within a loop in JavaScript?

I am calling the SongKick REST API from my ReactJS application, using Axios.
I do not know how many requests I am going to make, because each request can only return a limited amount of data, so each request has to ask for a different page.
In order to do so, I am using a while loop, like this:
while (toContinue)
{
  axios.get(baseUrl.replace('{page}', page))
    .then(result => {
      ...
      if (result.data.resultsPage.results.artist === undefined)
        toContinue = false;
    });
  page++;
}
Now, the problem that I'm facing is that I need to know when all my requests have finished. I know about Promise.all, but to use it I would need to know the exact URLs I'm going to request, and in my case I don't: I only know that the last request was the final one when it comes back with empty results.
Of course the axios.get call is asynchronous, so I am aware that my code is not optimal and should not be written like this, but I searched for this kind of problem and could not find a proper solution.
Is there any way to know when all the async code inside a function has finished?
Is there another way to get all the pages from a REST API without looping like this? Because the loop is asynchronous, it fires extra requests even after I have already received empty results.

Actually your code will lock up the engine: the while loop runs forever because the loop itself blocks the thread, so the .then callbacks of the asynchronous requests never get a chance to run and toContinue is never set to false.
To resolve that, use async / await to wait for each request before continuing the loop, so the loop actually terminates:
async function getResults() {
  let page = 1;
  while (true) {
    const result = await axios.get(baseUrl.replace('{page}', page));
    // ...
    if (!result.data.resultsPage.results.artist)
      break;
    page++;
  }
}
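If you also want to collect every page before moving on (the "know when all requests are finished" part of the question), the same loop can accumulate results into an array and return them once the first empty page comes back. A minimal sketch, assuming the per-page artists sit under result.data.resultsPage.results.artist as in the question:

async function getAllArtists() {
  const artists = [];
  let page = 1;
  while (true) {
    const result = await axios.get(baseUrl.replace('{page}', page));
    const pageArtists = result.data.resultsPage.results.artist;
    if (!pageArtists || pageArtists.length === 0)
      break;                        // empty page: everything has been fetched
    artists.push(...pageArtists);   // keep this page's results
    page++;
  }
  return artists;                   // resolves only after the last page
}

// usage: all requests are guaranteed to be finished once this promise resolves
getAllArtists().then(artists => console.log(artists.length));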

Related

Firestore not writing all documents

I'm using forEach to write over 300 documents with data from an object literal.
It works 80% of the time -- all documents get written, the other times it only writes half or so, before the response gets sent and the function ends. Is there a way to make it pause and always work correctly?
Object.entries(qtable).forEach(([key, value]) => {
  db.collection("qtable").doc(key).set({
    s: value.s,
    a: value.a
  }).then(function(docRef) {
    console.log("Document written with ID: ", docRef.id);
    res.status(200).send(qtable);
    return null;
  })
});
Would it be bad practice to just put a 2 second delay?
You are sending the response inside your loop, before the loop is complete. If you are using Cloud Functions (you didn't say), sending the response will terminate the function and clean up any extra work that hasn't completed.
You will need to make sure that you only send the response after all the async work is complete. This means you will have to pay attention to the promises returned by set() and use them to determine when to finally send the response. Learning how promises work in JavaScript is crucial to writing functions that work properly.
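A minimal sketch of that idea: collect the promises returned by set() and send the response only once they have all resolved (qtable, db and res are the names from the question; adjust the error handling to taste):

const writes = Object.entries(qtable).map(([key, value]) =>
  db.collection("qtable").doc(key).set({ s: value.s, a: value.a })
);

Promise.all(writes)
  .then(() => {
    // every set() has resolved, so it is now safe to respond
    res.status(200).send(qtable);
  })
  .catch(err => {
    console.error("Error writing documents: ", err);
    res.status(500).send(err);
  });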
You need to wait for the set() calls to conclude. They return a promise that you should deal with.
For instance, you can do this by pushing the return value of each set() into an array of promises and awaiting them all outside the loop (with Promise.all()).
Or you can await each individual call, but in this case you need to change the forEach() to a normal loop, because forEach() ignores the promises returned by its callback, so the surrounding function would not actually wait.
Also, you should probably set the response status just once, and outside the loop.
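For the second option, a sketch of the sequential version (writeAll is just a hypothetical wrapper; qtable, db and res are the names from the question):

async function writeAll(qtable, db, res) {
  for (const [key, value] of Object.entries(qtable)) {
    // each write completes before the next one starts
    await db.collection("qtable").doc(key).set({ s: value.s, a: value.a });
  }
  res.status(200).send(qtable);   // respond once, after every write has finished
}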

Trouble Fetching JSON files on server into Javascript Array

I'm quite new to using Javascript and particularly JSON, and I've been struggling to do this:
There is a JSON file on my web server which I am trying to access and parse into a JavaScript object. What I am trying to do is parse that JSON into an array and then further manipulate that array based on other user variables.
This is a sample of what the JSON looks like:
{"log":
[{
"name":"Al",
"entries":[8,12,16,19]},
{"name":"Steve",
"entries":[11,17,22]}]}
What I need to do is retrieve the array for one of the entries and store it in an array as a JavaScript object. What I have tried to do is this:
var entriesLogged;
fetch('url to the json file').then(function(response){
  return response.json();
}).then(function(data){
  entriesLogged = data.log[0].entries;
});
However, I can't seem to get this to work and to assign the value to the variable in a way that persists outside of this scope. I have been able to output the value of the array using console.log, but I have not been able to actually work with and manipulate that data like an object. I'd ideally like to parse the JSON file from the server onto a global array.
Most of the tutorials I've come across so far have used the JSON file to output console logs or change the contents of html elements, however I need to retrieve the values from the JSON into a global array first.
Am I missing something here? Does anyone have advice on how to do this?
Best wishes,
Dom
Are you trying to manipulate the entriesLogged variable after the fetch promise chain? The fetch is asynchronous so this means that any code after the fetch chain will run before the fetch chain finishes.
var entriesLogged;
fetch('url to the json file').then(function(response){
  return response.json();
}).then(function(data){
  entriesLogged = data.log[0].entries;
});
// Code down here will run before the fetch chain finishes,
// so if you're using entriesLogged down here, it will still be undefined
console.log(entriesLogged); // entriesLogged is undefined here
You might want to do something like:
.then(function(data){
  entriesLogged = data.log[0].entries;
  myFunction(entriesLogged);
});

function myFunction(myData) {
  console.log(myData); // can access entriesLogged here
}
What is happening is that the call is asynchronous: when you make it, the request is handed off to the browser and JavaScript immediately moves on to the next line without waiting.
var entriesLogged;
fetch(something).then(function(data){ entriesLogged = data[0]; }); // this runs in the background and takes e.g. 5 milliseconds, but JavaScript does not wait for it to complete; it goes straight to the next line
console.log(entriesLogged); // this is executed before the call has finished, so entriesLogged is still undefined
There are TONS of "answers" for variations of this question that don't actually solve the problem; they just describe the issue. (tl;dr solution is the last block of code)
The issue is that fetch('URL.json') and .json() are "asynchronous" calls. The script will start working on those calls, then continue on to the rest of the commands in your script while they are still being worked on. If you then hit a command BEFORE one of these asynchronous calls is done, the rest of your script will have no data to work with.
There are tons of places to look for fully understanding the ins and outs of how async calls work in JS, for example here:
How do I return the response from an asynchronous call?
If you're like me and you just want the code to read synchronously, the quick and dirty solution is to create an async main function (an immediately invoked async function) and then make sure you use the await keyword when calling asynchronous functions whose data you need to wait for.
Example main function:
(async () => {
  // put main code here
})();
You would then put your code inside that main function, BUT you have to make sure you add the await keyword in front of every asynchronous call you use. In the original example provided, you'll also have to declare the .then() callbacks that use await as async. This is confusing if you don't fully understand the asynchronous nature of JavaScript; I'll simplify it below, but first here is a working version of the original code so you can see what changed:
(async () => {
  var entriesLogged;
  await fetch('url_to_the_file.json')
    .then(async function(response){
      return await response.json();
    })
    .then(function(data){
      entriesLogged = data.log[0].entries;
    });
})();
The .then() call is typically used for post-processing so you can do work inside the asynchronous call... but since in this example we're deliberately making the flow sequential, we can avoid all of that and clean up the code with individual simple commands:
// Start of the async main function
(async () => {
  let response = await fetch('url_to_the_file.json');
  let entriesLogged = await response.json();
})();
That should get you what you're looking for and hopefully save anyone like us the heartache of spending hours trying to track down something that has a simple solution.
Also want to give a huge call out to the place I got the solution for this:
https://www.sitepoint.com/delay-sleep-pause-wait/

Run 1000 requests so that only 10 run at a time

With node.js I want to http.get a number of remote urls in a way that only 10 (or n) runs at a time.
I also want to retry a request if an exception occurs locally (m times), but when the status code returns an error (5XX, 4XX, etc.) the request counts as valid.
This is really hard for me to wrap my head around.
Problems:
Cannot try-catch http.get as it is async.
Need a way to retry a request on failure.
I need some kind of semaphore that keeps track of the currently active request count.
When all requests finished I want to get the list of all request urls and response status codes in a list which I want to sort/group/manipulate, so I need to wait for all requests to finish.
It seems like promises are recommended for every async problem, but I end up nesting too many of them and it quickly becomes undecipherable.
There are lots of ways to approach the 10 requests running at a time.
Async Library - Use the async library with the .parallelLimit() method where you can specify the number of requests you want running at one time.
Bluebird Promise Library - Use the Bluebird promise library and the request library to wrap your http.get() into something that can return a promise and then use Promise.map() with a concurrency option set to 10.
Manually coded - Code your requests manually to start up 10 and then each time one completes, start another one (a sketch of this option follows below).
In all cases, you will have to manually write some retry code and, as with all retry code, you will have to very carefully decide which types of errors you retry, how soon you retry them, how much you back off between retry attempts and when you eventually give up (all things you have not specified).
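Here is a minimal sketch of the manually coded option using the global fetch available in recent Node versions (fetchWithRetry and runLimited are placeholder names, and the retry test is deliberately simplistic; swap in your own http.get wrapper and error classification):

const MAX_CONCURRENT = 10;
const MAX_RETRIES = 3;

function fetchWithRetry(url, attempt = 0) {
  return fetch(url)
    .then(res => ({ url, status: res.status }))      // 4xx/5xx still count as a valid result
    .catch(err => {
      if (attempt < MAX_RETRIES) return fetchWithRetry(url, attempt + 1);
      return { url, status: null, error: err };      // give up after the retries are used up
    });
}

function runLimited(urls, limit) {
  const results = [];
  let next = 0;
  function worker() {
    if (next >= urls.length) return Promise.resolve();
    const i = next++;
    return fetchWithRetry(urls[i])
      .then(r => { results[i] = r; })                 // keep results in input order
      .then(worker);                                  // pick up the next URL when this one finishes
  }
  // start `limit` workers; together they never exceed `limit` in-flight requests
  const workers = Array.from({ length: Math.min(limit, urls.length) }, () => worker());
  return Promise.all(workers).then(() => results);
}

// usage
// runLimited(remoteUrls, MAX_CONCURRENT).then(results => console.log(results));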
Other related answers:
How to make millions of parallel http requests from nodejs app?
Million requests, 10 at a time - manually coded example
My preferred method is with Bluebird and promises. Including retry and result collection in order, that could look something like this:
const request = require('request');
const Promise = require('bluebird');
const get = Promise.promisify(request.get);

let remoteUrls = [...]; // large array of URLs

const maxRetryCnt = 3;
const retryDelay = 500;

Promise.map(remoteUrls, function(url) {
  let retryCnt = 0;
  function run() {
    return get(url).then(function(result) {
      // do whatever you want with the result here
      return result;
    }).catch(function(err) {
      // decide what your retry strategy is here; this retries every error,
      // so narrow the condition to the error types you actually want to retry.
      // catch all errors here so other URLs continue to execute
      if (retryCnt < maxRetryCnt) {
        ++retryCnt;
        // try again after a short delay
        // chain onto previous promise so Promise.map() is still
        // respecting our concurrency value
        return Promise.delay(retryDelay).then(run);
      }
      // make the value null if no retries succeeded
      return null;
    });
  }
  return run();
}, {concurrency: 10}).then(function(allResults) {
  // everything done here and allResults contains results with null for failed URLs
});
The simple way is to use the async library; its .parallelLimit() method does exactly what you need.
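A sketch of that approach using the closely related async.mapLimit(), which maps over the URL list directly and collects the results (makeRequest is a placeholder for your own request/retry logic):

const async = require('async');
const https = require('https');

// placeholder request function: reports the status code for each URL and
// treats network errors as a result; put your own retry logic inside here
function makeRequest(url, callback) {
  https.get(url, res => {
    res.resume();   // drain the response body
    callback(null, { url, status: res.statusCode });
  }).on('error', err => callback(null, { url, status: null, error: err.message }));
}

async.mapLimit(remoteUrls, 10, makeRequest, (err, results) => {
  // results is in the same order as remoteUrls; only 10 requests ran at any one time
  console.log(results);
});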

JavaScript Double Null Check and Locking

In a language with threads and locks it is easy to implement a lazy load: check the value of a variable, and if it's null, lock the next section of code, check the value again, then load the resource and assign it. This prevents the resource from being loaded multiple times and makes threads after the first wait for the first thread to complete the load.
Pseudo code:
if (myvar == null) {
  lock(obj) {
    if (myvar == null) {
      myvar = getData();
    }
  }
}
return myvar;
JavaScript runs in a single thread, however, it still has this type of issue because of asynchronous execution while one call is waiting on a blocking resource. In this Node.js example:
var allRecords;

module.exports = function getAllRecords(callback) {
  if (allRecords) {
    return callback(null, allRecords);
  }
  db.getRecords({}, function(err, records) {
    if (err) {
      return callback(err);
    }
    // Use the existing object if it has already been
    // set by another async request to this function
    allRecords = allRecords || records;
    return callback(null, allRecords);
  });
};
I'm lazy loading all the records from a small DB table the first time this function is called and then returning the in-memory records on subsequent calls.
Problem: If multiple async requests are made to this function at the same time then the table is going to be loaded unnecessarily from the DB multiple times.
In order to solve this I could simulate a locking mechanism by creating a var lock; variable and setting it to true while the table is loading. I would then put the other async calls into a setTimeout() loop and check back on this variable every (say) 1 second until the data was available and then allow them to return.
The problems with that solution are:
It's fragile: what if the first async call throws and never unsets the lock?
How many times do we loop back into the timer before giving up?
How long should the timer be set for? In some environments 1 second might be way too long and inefficient.
Is there a best practice for solving this in JavaScript?
On the first call to the service, initialize an array. Start the fetch operation. Create a Promise, store it in the array.
On subsequent calls, if the data is there, return an already-fulfilled Promise. If not, add another Promise to the array and return that.
When the data arrives, resolve all the waiting Promise objects in the list. (You can throw away the list once the data's there.)
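A sketch of that idea, in the shape of the getAllRecords() example above. A common simplification is to cache the in-flight Promise itself instead of keeping an explicit list; every concurrent caller then waits on the same promise:

let allRecordsPromise = null;

function getAllRecords() {
  if (!allRecordsPromise) {
    // the first caller kicks off the load; everyone else reuses the same promise
    allRecordsPromise = new Promise((resolve, reject) => {
      db.getRecords({}, (err, records) => {
        if (err) {
          allRecordsPromise = null;   // allow a later call to retry after a failure
          return reject(err);
        }
        resolve(records);
      });
    });
  }
  return allRecordsPromise;
}

// every concurrent caller triggers at most one DB load
getAllRecords().then(records => console.log(records.length));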
I really like the promise solution in the other answer -- very clever, very interesting. Promises aren't the dominant methodology, so you may need to educate the team. I'm going to go in another direction though.
What you're after is a memoize function -- an in-memory key/value cache of expensive results. JavaScript: The Good Parts has a memoize sample towards the end. Lodash has a memoize function. These assume synchronous processing, so they don't account for your scenario -- which is to say they'd hit the database lots of times until one of the "threads" replied.
The async library also has a memoize function that does exactly what you want. In its innards, it keeps a queue array of callbacks, and once it gets the answer, it both caches it and calls all the callbacks.
If you're into inventing, by all means, use promises. If you'd just like a plug-n-play answer, use async#memoize.
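A sketch of the async#memoize route, assuming the same callback-style db.getRecords() as above (the 'all' argument is only there to give the memo cache a stable key):

const async = require('async');

const getAllRecords = async.memoize(function (key, callback) {
  // runs once per key; concurrent callers are queued until the first result arrives
  db.getRecords({}, callback);
});

// both calls trigger a single DB query, even if they happen at the same time
getAllRecords('all', (err, records) => { /* use records */ });
getAllRecords('all', (err, records) => { /* use records */ });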

AngularJs: Have method return synchronously when it calls $http or $resource internally

Is there a way to wait on a promise so that you can get the actual result from it and return that instead of returning the promise itself? I'm thinking of something similar to how the C# await keyword works with Tasks.
Here is an example of why I'd like to have a method like canAccess() that returns true or false instead of a promise so that it can be used in an if statement. The method canAccess() would make an AJAX call using $http or $resource and then somehow wait for the promise to get resolved.
It would look something like this:
$scope.canAccess = function(page) {
  var resource = $resource('/api/access/:page');
  var result = resource.get({page: page});
  // how to await this and not return the promise but the real value
  return result.canAccess;
}
Is there any way to do this?
In general that's a bad idea. Let me tell you why. JavaScript in a browser is basically a single threaded beast. Come to think of it, it's single threaded in Node.js too. So anything you do to not "return" at the point you start waiting for the remote request to succeed or fail will likely involve some sort of looping to delay execution of the code after the request. Something like this:
var semaphore = false;
var superImportantInfo = null;

// Make a remote request.
$http.get('some wonderful URL for a service').then(function (results) {
  superImportantInfo = results;
  semaphore = true;
});

while (!semaphore) {
  // We're just waiting.
}

// Code we're trying to avoid running until we know the results of the URL call.
console.log('The thing I want for lunch is... ' + superImportantInfo);
But if you try that in a browser, it cannot work: the busy-wait loop blocks the single thread, so the .then callback never gets to run, the semaphore never flips, and eventually the browser decides your JavaScript is stuck and pops up a message giving the user the chance to stop your code. JavaScript therefore structures it like so:
// Make a remote request.
$http.get('some wonderful URL for a service').then(function (results) {
  // Code we're trying to avoid running until we know the results of the URL call.
  console.log('The thing I want for lunch is... ' + results);
});

// Continue on with other code which does not need the super important info or
// simply end our JavaScript altogether. The code inside the callback will be
// executed later.
The idea being that the code in the callback will be triggered by an event whenever the service call returns. Because event driven is how JavaScript likes it. Timers in JavaScript are events, user actions are events, HTTP/HTTPS calls to send and receive data generate events too. And you're expected to structure your code to respond to those events when they come.
Can you not structure your code so that it treats canAccess as false until the remote service call returns and perhaps finds out that it really is true after all? I do that all the time in AngularJS code where I don't yet know the ultimate set of permissions to show the user, or haven't received all of the data to display in the page. I have defaults which show until the real data comes back, and then the page adjusts to its new form based on the new data. The two-way binding of AngularJS makes that really quite easy.
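A small sketch of that default-then-update pattern, assuming a hypothetical controller and an endpoint shaped like the one in the question (the URL and response shape are assumptions):

$scope.canAccess = false;   // safe default until the server answers

$http.get('/api/access/' + page).then(function (response) {
  // any view bound to canAccess re-renders automatically when this flips
  $scope.canAccess = response.data.canAccess;
});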
Use a .get() callback function to ensure you get a resolved resource.
Helpful links:
Official docs
How to add call back for $resource methods in AngularJS
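For example, $resource actions accept success and error callbacks, so the check can run once the data has actually arrived. A sketch based on the canAccess() code from the question:

$scope.checkAccess = function (page) {
  var resource = $resource('/api/access/:page');
  resource.get({ page: page }, function (result) {
    // runs only after the resource has resolved
    $scope.canAccess = result.canAccess;
  }, function (error) {
    $scope.canAccess = false;
  });
};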
You can't - there aren't any features in angular, Q (promises) or javascript (at this point in time) that let you do that.
You will when ES7 happens (with await).
You can if you use another framework or a transpiler (as suggested in the article linked - Traceur transpiler or Spawn).
You can if you roll your own implementation!
My approach was to create a function with plain old JavaScript objects, as follows:
var globalRequestSync = function (pUrl, pVerbo, pCallBack) {
  var httpRequest = new XMLHttpRequest();
  httpRequest.onreadystatechange = function () {
    if (httpRequest.readyState == 4 && httpRequest.status == 200) {
      pCallBack(httpRequest.responseText);
    }
  };
  // the third argument (false) makes the request synchronous, so send() blocks
  httpRequest.open(pVerbo, pUrl, false);
  httpRequest.send(null);
};
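For reference, now that async/await exists, the non-blocking equivalent of the same idea reads almost as linearly without resorting to synchronous XHR (a sketch; the response shape is an assumption, and the caller itself has to be async, which is exactly the trade-off discussed above):

async function canAccess(page) {
  const response = await $http.get('/api/access/' + page);
  return response.data.canAccess;   // still a promise to the caller, but the body reads sequentially
}

// usage from inside another async function:
// if (await canAccess('settings')) { ... }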
I recently had this problem and made a utility called 'syncPromises'. This basically works by sending what I called an "instruction list", which is an array of functions to be called in order. You'll need to call the first then() to kick things off, then dynamically attach a new .then() when the response comes back with the next item in the instruction list, so you'll need to keep track of the index.
// instructionList is an array of functions that each return a promise.
function syncPromises(instructionList) {
  var i = 0,
      defer = $q.defer();
  function next(i) {
    // Each function in the instructionList needs to return a promise
    instructionList[i]().then(function () {
      i++;
      if (instructionList[i]) {
        next(i);
      } else {
        defer.resolve();   // every instruction has finished
      }
    });
  }
  next(i);
  return defer.promise;
}
This I found gave us the most flexibility.
You can automatically push operations to build up an instruction list, and you're also able to append as many .then() response handlers as you like in the calling function. You can also chain multiple syncPromises functions that will all happen in order.
