How to sequentially handle asynchronous results from API? - javascript

This question might be a little vague, but I'll try my best to explain.
I'm trying to create an array of all of the tweets that I can retrieve from Twitter's API, but it limits each request to 200 returned tweets. How can I make asynchronous requests to Twitter up to the maximum limit of 3200 returned tweets? What I mean is: is it possible to call Twitter's API asynchronously but build the array sequentially, making sure that the tweets are correctly sorted with regard to date?
So, I have an array:
var results = [];
and I'm using node's request module:
var request = require('request');
what I have right now (for just the limit of 200) is
request(options, function(err, response, body) {
    body = JSON.parse(body);
    for (var i = 0; i < body.length; i++) {
        results.push(body[i].text);
    }
    return res.json(results);
});
I've looked into using the 'promise' module, but I found it confusing. I tried using a while loop, but it got complicated because I couldn't follow the path the server was taking.
Let me know if this didn't explain things well.
In the end, I want results to be an array populated with all of the tweets that the requests send.

I would suggest using request-promise instead of request. Here is my solution.
var rp = require('request-promise');
var tweets = [];
var promises = [];
for (var i = 1; i < 10; i++) {
    var promise = rp(options);
    promises.push(promise);
}
Promise.all(promises).then(function(data) {
    data.forEach(function(item) {
        // handle tweets here
    });
    return res.json(tweets);
});
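To keep the tweets sorted by date across multiple requests, one approach is to page sequentially with max_id so each request picks up where the previous one ended. This is only a hedged sketch, assuming options targets the statuses/user_timeline endpoint and that res is an Express response as in the question:
var rp = require('request-promise');

async function fetchAllTweets(baseOptions) {
    var results = [];
    var maxId; // undefined on the first request
    for (var page = 0; page < 16; page++) { // 16 * 200 = 3200, the stated ceiling
        var options = Object.assign({}, baseOptions);
        options.qs = Object.assign({ count: 200 }, baseOptions.qs);
        if (maxId) {
            options.qs.max_id = maxId;
        }
        var body = JSON.parse(await rp(options));
        if (maxId) {
            body.shift(); // max_id is inclusive, so drop the duplicated boundary tweet
        }
        if (body.length === 0) {
            break; // no more tweets available
        }
        body.forEach(function(tweet) {
            results.push(tweet.text);
        });
        maxId = body[body.length - 1].id_str; // oldest tweet on this page
    }
    return results; // newest to oldest, in one pass
}

fetchAllTweets(options).then(function(results) {
    return res.json(results);
});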

Related

how to maintain the order of http requests using node js?

I have a bunch of data that I want to send to a server over HTTP. However, on the server side I need to process the data in the same order as it was sent (e.g. if the order of sending is elem1, elem2 and elem3, I would like to process elem1 first, then elem2 and then elem3). Since HTTP gives no guarantee that the order will be maintained, I need some way to maintain it.
Currently I am keeping the data in a queue; I send one element and await the response. Once the response reaches me, I send the next element.
while (!queue.isEmpty()) {
    let data = queue.dequeue();
    await sendDataToServer(data);
}
I am not sure whether this will actually work in a production environment, or what the impact on performance will be.
Any sort of help is much appreciated. Thank you
Sorry, I don't have enough reputation to comment, thus I am posting this as an answer.
Firstly, your code will work as intended.
However, since the server has to receive them in order, the performance won't be good. If you can change the server, I suggest you implement it like this:
Add an ID to each data item.
Send all the data items, no need to ensure order.
Create a buffer on the server, the buffer will be able to contain all the data items.
The server receives the items and puts them into the buffer in the right position.
Example code:
Client (see Promise.all)
let i = 0;
let promises = [];
await sendDataLengthToServer(queue.length());
while (!queue.isEmpty()) {
    let data = queue.dequeue();
    data.id = i++;
    // no need to wait for the request to finish
    promises.push(sendDataToServer(data));
}
await Promise.all(promises);
Server (pseudo-code)
length = receiveDataLengthFromClient()
buffer = new Array(length)
received = 0
onDataReceivedFromClient(data, {
    received = received + 1
    buffer[data.id] = data
    if (received == length) {
        // the buffer contains the data in the right order
    }
})
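For reference, here is a hedged sketch of the same buffering idea as runnable Express-style handlers. The route names and the processItem placeholder are assumptions for illustration, not part of the answer above:
const express = require('express');
const app = express();
app.use(express.json());

let buffer = [];
let expected = 0;
let received = 0;

// placeholder for whatever in-order processing the server needs to do
function processItem(item) {
    console.log('processing', item.id);
}

app.post('/length', (req, res) => {
    expected = req.body.length; // total number of items the client will send
    buffer = new Array(expected);
    received = 0;
    res.sendStatus(200);
});

app.post('/data', (req, res) => {
    buffer[req.body.id] = req.body; // slot each item into its original position
    received += 1;
    if (received === expected) {
        buffer.forEach(processItem); // buffer is now complete and in order
    }
    res.sendStatus(200);
});

app.listen(3000);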

Sending thousands of fetch requests crashes the browser. Out of memory

I was tasked with transferring a large portion of data using javascript and an API from one database to another. Yes I understand that there are better ways of accomplishing this task, but I was asked to try this method.
I wrote some JavaScript that makes a GET call to an API that returns an array of data, which I then turn around and send to another API as individual POST requests.
What I have written so far seems to work fairly well, and I have been able to send over 50k individual POST requests without any errors. But I run into trouble when the number of POST requests increases past around 100k: I end up running out of memory and the browser crashes.
From what I understand so far about promises, there may be an issue where promises (or something else?) are kept in heap memory even after they are resolved, which results in running out of memory after too many requests.
I've tried 3 different methods to get all the records to POST successfully after searching for the past couple of days. This has included using Bluebird's Promise.map, as well as breaking the array into chunks first before sending them as POST requests. Each method seems to work until it has processed about 100k records, at which point it crashes.
async function amGetRequest(controllerName) {
    try {
        const amURL = "http://localhost:8081/api/" + controllerName;
        const amResponse = await fetch(amURL, {
            "method": "GET",
        });
        return await amResponse.json();
    } catch (err) {
        closeModal();
        console.error(err);
    }
};

async function brmPostRequest(controllerName, body) {
    const brmURL = urlBuilderBRM(controllerName);
    const headers = headerBuilderBRM();
    try {
        await fetch(brmURL, {
            "method": "POST",
            "headers": headers,
            "body": JSON.stringify(body)
        });
    } catch (error) {
        closeModal();
        console.error(error);
    };
};
//V1.0 Send one by one and resolve all promises at the end.
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
let promiseArray = [];
for (let i = 0; i < amResult.length; i++) {
    promiseArray.push(await brmPostRequest(controllerName, amResult[i]));
};
const postResults = await Promise.all(promiseArray);

//V2.0 Use Bluebird's Promise.map with concurrency set to 100
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
const postResults = Promise.map(amResult, async data => {
    await brmPostRequest(controllerName, data);
    return Promise.resolve();
}, {concurrency: 100});

//V3.0 Chunk array into max 1000 records and resolve 1000 promises before looping to the next 1000 records
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
const numPasses = Math.ceil(amResult.length / 1000);
for (let i = 0; i <= numPasses; i++) {
    let subset = amResult.splice(0, 1000);
    let promises = subset.map(async (record) => {
        await brmPostRequest(controllerName, record);
    });
    await Promise.all(promises);
    subset.length = 0; //clear out temp array before looping again
};
Is there something that I am missing about getting these promises cleared out of memory after they have been resolved?
Or perhaps a better method of accomplishing this task?
Edit: Disclaimer - I'm still fairly new to JS and still learning.
"Well-l-l-l ... you're gonna need to put a throttle on this thing!"
Without (pardon me ...) attempting to dive too deeply into your code, "no matter how many records you need to transfer, you need to control the number of requests that the browser attempts to do at any one time."
What's probably happening right now is that you're stacking up hundreds or thousands of "promised" requests in local memory – but, how many requests can the browser actually transmit at once? That should govern the number of requests that the browser actually attempts to do. As each reply is returned, your software then decides whether to start another request and if so for which record.
Conceptually, you have so-many "worker bees," according to the number of actual network requests your browser can simultaneously do. Your software never attempts to launch more simultaneous requests than that: it simply launches one new request as each one request is completed. Each request, upon completion, triggers code that decides to launch the next one.
So – you never are "sending thousands of fetch requests." You're probably sending only a handful at a time, even though, in this you-controlled manner, "thousands of requests do eventually get sent."
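A hedged sketch of that idea, with illustrative names rather than anything from the question (postOne stands in for something like brmPostRequest):
// keep at most poolSize requests in flight at any one time
async function postAllWithPool(records, postOne, poolSize) {
    let next = 0;
    async function worker() {
        while (next < records.length) {
            const index = next++;          // claim the next record
            await postOne(records[index]); // only this worker waits on it
        }
    }
    // start poolSize workers; each picks up a new record as soon as it finishes one
    await Promise.all(Array.from({ length: poolSize }, () => worker()));
}
Called as postAllWithPool(amResult, record => brmPostRequest(controllerName, record), 6), this would keep roughly six requests active at a time instead of hundreds of thousands of pending promises.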
As you are not interested in the values delivered by brmPostRequest(), there's no point mapping the original array; neither the promises nor the results need to be accumulated.
Not doing so will save memory and may allow progress beyond the 100k sticking point.
async function foo() {
    const amResult = await amGetRequest(controllerName);
    let counts = { 'successes': 0, 'errors': 0 };
    for (let i = 0; i < amResult.length; i++) {
        try {
            await brmPostRequest(controllerName, amResult[i]);
            counts.successes += 1;
        } catch(err) {
            counts.errors += 1;
        }
    };
    console.log(counts);
}

why is .forEach() acting asynchronously? - node.js [duplicate]

This question already has answers here:
Why am I not able to access the result of multiple networking calls?
(2 answers)
Closed 5 years ago.
I'm trying to get some information from a web page, using request to fetch the page and then cheerio to traverse the DOM to the specific part I need. I'm repeating this process for multiple elements in an array with array.forEach, using this code:
const cheerio = require('cheerio');
const request = require('request');
var i = 0;
var rates = [];
['AUD', 'CAD'].forEach(function(currency) {
    var url = "https://www.google.com/finance/converter?a=1&from=USD&to=" + currency;
    request(url, function(error, res, body) {
        const $ = cheerio.load(body);
        var temp = $('.bld');
        var rate = temp.text();
        console.log(rate);
        rates[i] = rate;
        i++;
    });
});
console.log('done');
the result I'm expecting is something like this:
1.31 AUD
1.28 CAD
done
but I'm getting this instead:
done
1.31 AUD
1.28 CAD
Can you tell me why array.forEach isn't blocking my code?
You are printing 'done' before any of your http requests to google return.
You loop through the currencies, making a call to Google for each one, and print 'done'. Then the calls start to return (in random order, by the way) and you print the results.
So, the forEach isn't asynchronous, but the http requests are.
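If you want the expected output, one option (a hedged sketch, not part of the answer above) is to turn each request into a promise and wait for all of them; Promise.all keeps the results in input order no matter which request finishes first:
const util = require('util');
const request = require('request');
const cheerio = require('cheerio');

const requestP = util.promisify(request); // resolves with the response object

async function getRates(currencies) {
    const rates = await Promise.all(currencies.map(async (currency) => {
        const url = 'https://www.google.com/finance/converter?a=1&from=USD&to=' + currency;
        const res = await requestP(url);
        const $ = cheerio.load(res.body);
        return $('.bld').text(); // e.g. "1.31 AUD"
    }));
    rates.forEach((rate) => console.log(rate)); // still in input order
    console.log('done');                        // now prints last
    return rates;
}

getRates(['AUD', 'CAD']);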

Get Cloudflare's HTTP_CF_IPCOUNTRY header with javascript?

There are many SO questions about how to get HTTP headers with JavaScript, but for some reason they don't cover the HTTP_CF_IPCOUNTRY header.
If I try it with PHP, echo $_SERVER["HTTP_CF_IPCOUNTRY"]; works, so CF is working just fine.
Is it possible to get this header with javascript?
#Quentin's answer stands correct and holds true for any JavaScript client trying to access server headers.
However, since this question is specific to Cloudflare and to getting the two-letter country ISO code normally found in the HTTP_CF_IPCOUNTRY header, I believe I have a work-around that best fits the question asked.
Below is a code excerpt that I use on my frontend Ember App, sitting behind Cloudflare... and varnish... and fastboot...
function parseTrace(url) {
    let trace = [];
    $.ajax(url, {
        success: function(response) {
            let lines = response.split('\n');
            let keyValue;
            lines.forEach(function(line) {
                keyValue = line.split('=');
                trace[keyValue[0]] = decodeURIComponent(keyValue[1] || '');
                if (keyValue[0] === 'loc' && trace['loc'] !== 'XX') {
                    alert(trace['loc']);
                }
                if (keyValue[0] === 'ip') {
                    alert(trace['ip']);
                }
            });
            return trace;
        },
        error: function() {
            return trace;
        }
    });
};
let cfTrace = parseTrace('/cdn-cgi/trace');
The performance is really great; don't be afraid to call this function even before you call other APIs or functions. I have found it to be as quick as, or sometimes even quicker than, retrieving static resources from Cloudflare's cache. You can run a profile on Pingdom to confirm this.
Assuming you are talking about client side JavaScript: no, it isn't possible.
The browser makes an HTTP request to the server.
The server notices what IP address the request came from
The server looks up that IP address in a database and finds the matching country
The server passes that country to PHP
The data never even goes near the browser.
For JavaScript to access it, you would need to read it with server side code and then put it in a response back to the browser.
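As a hedged illustration of that last step (the /api/country route is made up), a minimal Express handler can read the header on the server and hand it back to the browser:
const express = require('express');
const app = express();

app.get('/api/country', (req, res) => {
    // Cloudflare adds this header to requests that pass through its proxy
    res.json({ countryCode: req.headers['cf-ipcountry'] || null });
});

app.listen(3000);
Client-side JavaScript can then fetch('/api/country') and read countryCode from the JSON response.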
fetch('https://cloudflare-quic.com/b/headers')
    .then(res => res.json())
    .then(data => {
        console.log(data.headers['Cf-Ipcountry']);
    });
Reference:
https://cloudflare-quic.com/b
https://cloudflare-quic.com/b/headers
Useful Links:
https://www.cloudflare.com/cdn-cgi/trace
https://github.com/fawazahmed0/cloudflare-trace-api
Yes you have to hit the server - but it doesn't have to be YOUR server.
I have a shopping cart where pretty much everything is cached by Cloudflare - so I felt it would be stupid to go to MY server to get just the country code.
Instead I am using a Cloudflare Worker (additional charges):
addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
    var countryCode = request.headers.get('CF-IPCountry');
    return new Response(
        JSON.stringify({ countryCode }),
        {
            headers: {
                "Content-Type": "application/json"
            }
        }
    );
}
You can map this script to a route such as /api/countrycode and then when your client makes an HTTP request it will return essentially instantly (for me it's about 10ms).
/api/countrycode
{
    "countryCode": "US"
}
A couple of additional things:
You can't use Workers on all service levels.
It would be best to deploy an actual web service on the same URL as a backup (if Workers aren't enabled or supported, or during development).
There are charges, but they should be negligible.
It seems like there's a new feature where you can map a single path to a single script. That's what I am doing here. I think this used to be an enterprise-only feature, but it's now available to me, so that's great.
Don't forget that it may be "T1" for the Tor network.
Since I wrote this they've exposed more properties on Request.cf - even on lower priced plans:
https://developers.cloudflare.com/workers/runtime-apis/request#incomingrequestcfproperties
You can now get city, region and even longitude and latitude, without having to use a geo lookup database.
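For example, a hedged Worker sketch reading those properties (which fields are populated depends on the plan and on Cloudflare having the data):
addEventListener('fetch', event => {
    const cf = event.request.cf || {};
    event.respondWith(new Response(
        JSON.stringify({
            country: cf.country,
            city: cf.city,
            region: cf.region,
            latitude: cf.latitude,
            longitude: cf.longitude
        }),
        { headers: { 'Content-Type': 'application/json' } }
    ));
});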
I've taken Don Omondi's answer, and converted it to a promise function for ease of use.
function get_country_code() {
    return new Promise((resolve, reject) => {
        var trace = [];
        jQuery.ajax('/cdn-cgi/trace', {
            success: function(response) {
                var lines = response.split('\n');
                var keyValue;
                for (var index = 0; index < lines.length; index++) {
                    const line = lines[index];
                    keyValue = line.split('=');
                    trace[keyValue[0]] = decodeURIComponent(keyValue[1] || '');
                    if (keyValue[0] === 'loc' && trace['loc'] !== 'XX') {
                        return resolve(trace['loc']);
                    }
                }
            },
            error: function() {
                return reject(trace);
            }
        });
    });
}
usage example
get_country_code().then((country_code) => {
    // do something with the variable country_code
}).catch((err) => {
    // caught the error, now do something with it
});

node-ntwitter and the Twitter API limits

This question relates to the API limits. It might be something I'm just missing, I'm not sure:
If I do this:
twit.showUser(ids, function(error, response) {
    console.log(response);
});
Where ids is an array with length < 100, all is well.
When I do the same and ids contains more than 100 entries, it fails.
This is based on:
https://dev.twitter.com/docs/api/1.1/get/users/lookup
And specifically:
for up to 100 users per request
Is this somehow managed in the ntwitter module, or do I need to manage it myself? If so, any recommendation on how to manage that?
Or, outside of the node-ntwitter module, how would you recommend solving this in a clean way if I want to send back a composite JSON of all responses from showUser()?
for up to 100 users per request
means you can only request 100 ids at a time. You can run a for loop that advances by 100 each iteration:
for (var i = 0; i < ids.length; i += 100) {
    var requestIds = ids.slice(i, i + 100);
    twit.showUser(requestIds, function(error, response) {
        console.log(response);
    });
}
(untested)
As it's Node.js, you might want to do this asynchronously. Check out the async module's forEachLimit and forEachSeries.
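A hedged sketch of batching with the async module (eachSeries is the current name for forEachSeries; treating response as an array of user objects is an assumption):
var async = require('async');

function lookupAll(twit, ids, done) {
    var batches = [];
    for (var i = 0; i < ids.length; i += 100) {
        batches.push(ids.slice(i, i + 100));
    }
    var users = [];
    async.eachSeries(batches, function(batch, next) {
        twit.showUser(batch, function(error, response) {
            if (error) return next(error);
            users = users.concat(response); // build the composite result
            next();
        });
    }, function(err) {
        done(err, users); // all batches done, users holds every response
    });
}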
