node-ntwitter and Twitter API limits - javascript

A question about the API limits. It might be something I'm just missing, but I'm not sure:
If I do this:
twit.showUser(ids, function(error, response) {
  console.log(response);
});
Where ids is an array with fewer than 100 entries, all is well.
When I do the same and ids has more than 100 entries, it fails.
This is based on:
https://dev.twitter.com/docs/api/1.1/get/users/lookup
And specifically:
for up to 100 users per request
Is this somehow managed in the ntwitter module, or do I need to manage it outside? If so, any recommendation on how?
Or, outside of the node-ntwitter module, how would you recommend solving this in a clean way if I want to send back a composite JSON of all responses from showUser()?

for up to 100 users per request
means you can only request 100 ids at a time. You can run a for-loop that advances in steps of 100:
for (var i = 0; i < ids.length; i += 100) {
  var requestIds = ids.slice(i, i + 100); // slice's end index is exclusive
  twit.showUser(requestIds, function(error, response) {
    console.log(response);
  });
}
(untested)
As it's Node.js, you might want to run the batches asynchronously with limited concurrency. Check out the async library's forEachLimit and forEachSeries.
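If you also want the composite JSON asked about in the question, here is a minimal sketch (untested) that batches the ids, fires one showUser() call per batch, and merges the responses in their original order. It assumes twit is an authenticated ntwitter client and that each response is an array of user objects:
function showUsersInBatches(ids, done) {
  // split ids into chunks of at most 100 (the users/lookup limit)
  var chunks = [];
  for (var i = 0; i < ids.length; i += 100) {
    chunks.push(ids.slice(i, i + 100));
  }
  var results = [];
  var pending = chunks.length;
  var failed = false;
  chunks.forEach(function (chunk, index) {
    twit.showUser(chunk, function (error, response) {
      if (failed) return;            // a previous batch already errored out
      if (error) { failed = true; return done(error); }
      results[index] = response;     // keep batches in their original order
      if (--pending === 0) {
        // flatten the per-batch arrays into one composite array
        done(null, [].concat.apply([], results));
      }
    });
  });
}
Usage would then be showUsersInBatches(ids, function (error, users) { console.log(users); });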

Related

how to maintain the order of http requests using node js?

I have a bunch of data that I want to send to a server over HTTP. On the server side I need to process the data in the same order it was sent (e.g. if the order of sending is elem1, elem2 and elem3, I want to process elem1 first, then elem2 and then elem3). Since HTTP gives no guarantee that the order will be maintained, I need some way to maintain it.
Currently I keep the data in a queue; I send one element and await the response. Once the response reaches me I send the next element.
while (!queue.isEmpty()) {
  let data = queue.dequeue();
  await sendDataToServer(data); // wait for each response before sending the next item
}
I am not sure whether this will actually work in a production environment, or what the impact on performance will be.
Any sort of help is much appreciated. Thank you
Sorry, I don't have enough reputation to comment, thus I am posting this as an answer.
Firstly, your code will work as intended.
However, since the server has to receive the items strictly one after another, the performance won't be good. If you can change the server, I suggest you implement it like this:
1. Add an ID to each data item.
2. Send all the data items; there is no need to ensure order.
3. Create a buffer on the server large enough to hold all the data items.
4. The server receives the items and puts each one into the buffer at the right position.
Example code:
Client (see Promise.all)
let i = 0;
let promises = [];
await sendDataLengthToServer(queue.length());
while (!queue.isEmpty()) {
  let data = queue.dequeue();
  data.id = i++; // tag each item with its position in the queue
  // no need to wait for a request to finish
  promises.push(sendDataToServer(data));
}
await Promise.all(promises);
Server (pseudo-code)
length = receiveDataLengthFromClient()
buffer = new Array(length)
received = 0
onDataReceivedFromClient(data, {
  received = received + 1
  buffer[data.id] = data
  if (received == length) {
    // the buffer contains the data in the right order
  }
})
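For reference, here is the same buffer idea as a small runnable JavaScript sketch; receiveLength, receiveItem and processInOrder are hypothetical hooks standing in for whatever transport and handler you actually use:
let buffer = null;
let received = 0;

function receiveLength(length) {   // called once, before any items arrive
  buffer = new Array(length);
  received = 0;
}

function receiveItem(data) {       // called once per item, in any arrival order
  buffer[data.id] = data;          // the id places the item in its correct slot
  received += 1;
  if (received === buffer.length) {
    // the buffer now holds every item in sending order
    buffer.forEach(item => processInOrder(item));
  }
}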

Sending thousands of fetch requests crashes the browser. Out of memory

I was tasked with transferring a large portion of data using JavaScript and an API from one database to another. Yes, I understand that there are better ways of accomplishing this task, but I was asked to try this method.
I wrote some JavaScript that makes a GET call to an API that returns an array of data, which I then turn around and send to another API as individual POST requests.
What I have written so far seems to work fairly well, and I have been able to send over 50k individual POST requests without any errors. But I am having trouble when the number of POST requests increases past around 100k. I end up running out of memory and the browser crashes.
From what I understand so far about promises, there may be an issue where promises (or something else?) are kept in heap memory after they are resolved, which results in running out of memory after too many requests.
I've tried 3 different methods over the past couple of days to get all the records to POST successfully. This has included using Bluebird's Promise.map, as well as breaking the array into chunks first before sending them as POST requests. Each method seems to work until it has processed about 100k records, then it crashes.
async function amGetRequest(controllerName) {
  try {
    const amURL = "http://localhost:8081/api/" + controllerName;
    const amResponse = await fetch(amURL, {
      "method": "GET",
    });
    return await amResponse.json();
  } catch (err) {
    closeModal();
    console.error(err);
  }
};
async function brmPostRequest(controllerName, body) {
  const brmURL = urlBuilderBRM(controllerName);
  const headers = headerBuilderBRM();
  try {
    await fetch(brmURL, {
      "method": "POST",
      "headers": headers,
      "body": JSON.stringify(body)
    });
  }
  catch(error) {
    closeModal();
    console.error(error);
  };
};
//V1.0 Send one by one and resolve all promises at the end.
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
let promiseArray = [];
for (let i = 0; i < amResult.length; i++) {
  promiseArray.push(await brmPostRequest(controllerName, amResult[i]));
};
const postResults = await Promise.all(promiseArray);

//V2.0 Use Bluebird's Promise.map with concurrency set to 100
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
const postResults = Promise.map(amResult, async data => {
  await brmPostRequest(controllerName, data);
  return Promise.resolve();
}, {concurrency: 100});

//V3.0 Chunk array into max 1000 records and resolve 1000 promises before looping to the next 1000 records
const amResult = await amGetRequest(controllerName); //(returns an array of ~245,000 records)
const numPasses = Math.ceil(amResult.length / 1000);
for (let i = 0; i <= numPasses; i++) {
  let subset = amResult.splice(0, 1000);
  let promises = subset.map(async (record) => {
    await brmPostRequest(controllerName, record);
  });
  await Promise.all(promises);
  subset.length = 0; //clear out temp array before looping again
};
Is there something that I am missing about getting these promises cleared out of memory after they have been resolved?
Or perhaps a better method of accomplishing this task?
Edit: Disclaimer - I'm still fairly new to JS and still learning.
"Well-l-l-l ... you're gonna need to put a throttle on this thing!"
Without (pardon me ...) attempting to dive too deeply into your code, "no matter how many records you need to transfer, you need to control the number of requests that the browser attempts to do at any one time."
What's probably happening right now is that you're stacking up hundreds or thousands of "promised" requests in local memory – but, how many requests can the browser actually transmit at once? That should govern the number of requests that the browser actually attempts to do. As each reply is returned, your software then decides whether to start another request and if so for which record.
Conceptually, you have so-many "worker bees," according to the number of actual network requests your browser can simultaneously do. Your software never attempts to launch more simultaneous requests than that: it simply launches one new request as each one request is completed. Each request, upon completion, triggers code that decides to launch the next one.
So – you never are "sending thousands of fetch requests." You're probably sending only a handful at a time, even though, in this you-controlled manner, "thousands of requests do eventually get sent."
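Here is a minimal sketch of that worker-pool idea, reusing amGetRequest and brmPostRequest from the question; the concurrency of 6 is an arbitrary assumption, not a measured browser limit:
async function transferAll(controllerName, concurrency = 6) {
  const records = await amGetRequest(controllerName);
  let next = 0;

  // each "worker bee" claims the next record, POSTs it, and only then
  // claims another, so at most `concurrency` requests are in flight
  async function worker() {
    while (next < records.length) {
      const record = records[next++];
      await brmPostRequest(controllerName, record);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, worker));
}
Only the handful of worker promises are retained, so nothing accumulates per record.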
As you are not interested in the values delivered by brmPostRequest(), there's no point mapping the original array; neither the promises nor the results need to be accumulated.
Not doing so will save memory and may allow progress beyond the 100k sticking point.
async function foo() {
  const amResult = await amGetRequest(controllerName);
  let counts = { 'successes': 0, 'errors': 0 };
  for (let i = 0; i < amResult.length; i++) {
    try {
      await brmPostRequest(controllerName, amResult[i]);
      counts.successes += 1;
    } catch (err) {
      counts.errors += 1;
    }
  }
  console.log(counts);
}

Can someone explain an ENOBUFS error?

I'm making a bunch of calls to a database that contains a large amount of data, on a Windows 7 64-bit OS. As the calls queue up I get this error (for every HTTP call after the first error):
Error: connect ENOBUFS *omitted* - Local (undefined:undefined)
From my Google searching I've learned that this error means that my buffer has grown too large and my system's memory can no longer handle the buffer's size.
But I don't really understand what this means. I'm using Node.js with an HTTPS library to handle my requests. When the requests are getting queued and the sockets are opening, is the buffer's size allocated in RAM? What would allow the buffer to expand to a greater size? Is this simply a hardware limitation?
I've also read that some OSes can handle the buffer size better than others. Is this the case? If so, which OS would be better suited to running a Node script that needs to fetch a lot of data via HTTPS requests?
Here's how I'm doing my requests.
for (let j = 0; j < dataQueries; j++) {
  getData(function(res) {
    // handle parsed response (omitted)
  });
}
function getData(callback) {
  axios.get(url, config)
    .then((res) => {
      // parse res
      callback(parsedRes(res));
    }).catch(function(err) {
      console.log("Spooky problem alert! : " + err);
    });
}
I've omitted some code for brevity, but this is generally how I'm doing my requests. I have a for loop that launches a GET request via axios on every iteration.
I know there is an axios.all method for collecting the promises that axios's HTTP methods return, but I saw no change in behavior when I set it up to store the promises and then iterate over them via axios.all.
Thanks @Jonasw for your help, but there is a very simple solution to this problem.
I used the small library throttled-queue to get the job done. (If you look at the source code it would be pretty easy to implement your own queue based on this package.)
My code changed to:
const throttledQueue = require('throttled-queue')
let throttle = throttledQueue(15, 1000) // at most 15 requests per second
for (let j = 0; j < dataQueries; j++) {
  throttle(function() {
    getData(function(res) {
      // do parsing
    });
  });
}
function getData(callback) {
  axios.get(url, config)
    .then((res) => {
      // parse res
      callback(parsedRes(res));
    }).catch(function(err) {
      console.log("Spooky problem alert! : " + err);
    });
}
In my case this was resolved by deleting the autogenerated zip files from my workspace, which were created every time I ran cdk deploy. It turns out my TypeScript compiler treated these files as source files and included them in the tarball.
You're starting a lot of data queries at the same time. You could chain them using a partly recursive function, so that they're executed one after another:
(function proceedwith(j) {
  getData(function() {
    if (j < dataQueries - 1) proceedwith(j + 1);
  });
})(0);
I experienced the same issue when starting too many requests.
I tried throttled-queue, but it wasn't working correctly for me.
system-sleep worked for me, effectively slowing down the rate at which the requests were made. Sleep is best used in synchronous code, as a blocking call before running sync/async code.
Example (using sleep to limit the rate at which updateAddress() is called):
// Asynchronous call (what is important is that forEach is synchronous)
con.query(sql, function (err, result) {
  if (err) throw err;
  result.forEach(function(element) {
    sleep(120); // blocking call: sleep for 120ms
    updateAddress(element.address); // another async call (network request)
  });
});

Parameters not being passed through Request in NodeJs

I am trying to make a request with the request package and can't seem to pass through a simple parameter.
Anyone know what would be the best way to pass it through?
asyncRefreshToken()
  .then(function(token) {
    console.log('Got the token! ' + token);
    for (var k = 0; k < 2; k++) {
      var url = 'https://www.googleapis.com/analytics/v3/data/realtime?ids=ga:' + brandsConfig.brands[k].profileId + '&metrics=rt%3AactiveUsers&dimensions=rt%3ApagePath&sort=-rt%3AactiveUsers&access_token=' + token;
      var options = {
        url: url,
        method: 'GET'
      };
      request(options, function (error, response, body) {
        if (!error && response.statusCode == 200) {
          // Print out the response body
          var parsed = JSON.parse(body);
          var activeUsers = parsed.totalResults;
          console.log(brandsConfig.brands[k].title + ': ' + activeUsers);
        }
      });
    }
  });
Sorry, I should be more specific: brandsConfig.brands[k].title will only ever log the last value, i.e. brandsConfig.brands[1].title.
What I am trying to achieve:
Once a token has been obtained (from asyncRefreshToken), use the request package to query the Google Analytics API for a list of brands.
The brands are in the array brandsConfig.brands; the corresponding title can be obtained from brandsConfig.brands[k].title.
For now, while I'm learning, the result can just go to the console.
So ideal result:
* Got the token! 1234567890
* Brand 1 : 582432
* Brand 2 : 523423
Current output:
* Got the token! 1234567890
* Brand 2 : 582432
* Brand 2 : 523423
Your problem is caused by the combination of a for loop and an asynchronous request. What's happening is that your loop begins and kicks off the first request. The request is asynchronous (since it's over ye olde interwebs). This means that the code in the callback will not be executed right away; it will be "skipped" until the asynchronous request returns. The important thing is that your for loop keeps executing, incrementing k, and kicking off a new request. Now your code has finished except for the callbacks to the two requests.
Now the first one comes back and executes the code in the callback. What is the value of k? Since the loop kept going, k has already reached its final value, so both callbacks see the same k. The same thing happens to the second request.
The important thing is that a callback does not create its own context that only it can touch.
There are 3 ways out of this: figure out a way that does not put an async operation in the for loop, use the async library, or learn about closures (read 3 different explanations to get a good intuition on this one).
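For the closure route, here is a minimal sketch of the loop from the question (same brandsConfig, token and request assumed): declaring the loop variable with let gives each iteration, and therefore each callback, its own binding of k.
for (let k = 0; k < brandsConfig.brands.length; k++) {
  const brand = brandsConfig.brands[k]; // captured once per iteration
  const url = 'https://www.googleapis.com/analytics/v3/data/realtime?ids=ga:' +
    brand.profileId +
    '&metrics=rt%3AactiveUsers&dimensions=rt%3ApagePath' +
    '&sort=-rt%3AactiveUsers&access_token=' + token;
  request({ url: url, method: 'GET' }, function (error, response, body) {
    if (!error && response.statusCode == 200) {
      var activeUsers = JSON.parse(body).totalResults;
      console.log(brand.title + ': ' + activeUsers); // logs the matching brand
    }
  });
}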

Get Cloudflare's HTTP_CF_IPCOUNTRY header with javascript?

There are many SO questions about how to get HTTP headers with JavaScript, but for some reason they don't cover the HTTP_CF_IPCOUNTRY header.
If I try it with PHP, echo $_SERVER["HTTP_CF_IPCOUNTRY"]; works, so CF is working just fine.
Is it possible to get this header with JavaScript?
@Quentin's answer is correct and holds true for any JavaScript client trying to access server headers.
However, since this question is specific to Cloudflare and to getting the two-letter ISO country code normally carried in the HTTP_CF_IPCOUNTRY header, I believe I have a work-around that best fits the question asked.
Below is a code excerpt that I use on my frontend Ember App, sitting behind Cloudflare... and varnish... and fastboot...
function parseTrace(url) {
  let trace = [];
  $.ajax(url, {
    success: function(response) {
      let lines = response.split('\n');
      let keyValue;
      lines.forEach(function(line) {
        keyValue = line.split('=');
        trace[keyValue[0]] = decodeURIComponent(keyValue[1] || '');
        if (keyValue[0] === 'loc' && trace['loc'] !== 'XX') {
          alert(trace['loc']);
        }
        if (keyValue[0] === 'ip') {
          alert(trace['ip']);
        }
      });
      return trace; // note: returns from the ajax callback, not from parseTrace itself
    },
    error: function() {
      return trace;
    }
  });
}
let cfTrace = parseTrace('/cdn-cgi/trace');
The performance is really great; don't be afraid to call this function even before you call other APIs or functions. I have found it to be as quick as, or sometimes even quicker than, retrieving static resources from Cloudflare's cache. You can run a profile on Pingdom to confirm this.
Assuming you are talking about client-side JavaScript: no, it isn't possible.
1. The browser makes an HTTP request to the server.
2. The server notices which IP address the request came from.
3. The server looks up that IP address in a database and finds the matching country.
4. The server passes that country to PHP.
The data never even goes near the browser.
For JavaScript to access it, you would need to read it with server side code and then put it in a response back to the browser.
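As an illustration only, here is a minimal Node/Express sketch of that server-side relay (the /api/country route name is made up; the question itself uses PHP):
const express = require('express');
const app = express();

// expose the Cloudflare header to the browser as a small JSON endpoint
app.get('/api/country', (req, res) => {
  // Node lower-cases incoming header names
  res.json({ country: req.headers['cf-ipcountry'] || null });
});

app.listen(3000);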
fetch('https://cloudflare-quic.com/b/headers')
  .then(res => res.json())
  .then(data => console.log(data.headers['Cf-Ipcountry']));
Reference:
https://cloudflare-quic.com/b
https://cloudflare-quic.com/b/headers
Useful Links:
https://www.cloudflare.com/cdn-cgi/trace
https://github.com/fawazahmed0/cloudflare-trace-api
Yes, you have to hit the server - but it doesn't have to be YOUR server.
I have a shopping cart where pretty much everything is cached by Cloudflare, so I felt it would be stupid to go to MY server just to get the country code.
Instead I am using a Cloudflare Worker (additional charges):
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})
async function handleRequest(request) {
  var countryCode = request.headers.get('CF-IPCountry');
  return new Response(
    JSON.stringify({ countryCode }),
    { headers: { "Content-Type": "application/json" } }
  );
}
You can map this script to a route such as /api/countrycode and then when your client makes an HTTP request it will return essentially instantly (for me it's about 10ms).
/api/countrycode
{
"countryCode": "US"
}
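The client call could then be as simple as this sketch:
fetch('/api/countrycode')
  .then(res => res.json())
  .then(data => console.log(data.countryCode)); // e.g. "US"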
A couple of additional things:
* Workers aren't available on all service levels.
* It would be best to deploy an actual web service on the same URL as a backup, for when Workers aren't enabled or supported, or during development.
* There are charges, but they should be negligible.
* It seems there's a new feature where you can map a single path to a single script, which is what I am doing here. I think this used to be an enterprise-only feature, but it's now available to me, so that's great.
Don't forget that the value may be T1 for traffic from the Tor network.
Since I wrote this they've exposed more properties on Request.cf, even on lower-priced plans:
https://developers.cloudflare.com/workers/runtime-apis/request#incomingrequestcfproperties
You can now get city, region and even longitude and latitude, without having to use a geo lookup database.
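For example, a Worker along these lines; the field names follow the incomingrequestcfproperties documentation linked above, though availability of some fields varies by plan:
addEventListener('fetch', event => {
  // geo fields come straight off request.cf, no lookup database needed
  const { country, city, region, longitude, latitude } = event.request.cf;
  event.respondWith(new Response(
    JSON.stringify({ country, city, region, longitude, latitude }),
    { headers: { 'Content-Type': 'application/json' } }
  ));
});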
I've taken Don Omondi's answer, and converted it to a promise function for ease of use.
function get_country_code() {
  return new Promise((resolve, reject) => {
    var trace = [];
    jQuery.ajax('/cdn-cgi/trace', {
      success: function(response) {
        var lines = response.split('\n');
        var keyValue;
        for (var index = 0; index < lines.length; index++) {
          const line = lines[index];
          keyValue = line.split('=');
          trace[keyValue[0]] = decodeURIComponent(keyValue[1] || '');
          if (keyValue[0] === 'loc' && trace['loc'] !== 'XX') {
            return resolve(trace['loc']);
          }
        }
      },
      error: function() {
        return reject(trace);
      }
    });
  });
}
Usage example:
get_country_code().then((country_code) => {
  // do something with the variable country_code
}).catch((err) => {
  // caught the error, now do something with it
});
