Slow third-party APIs clogging Express server

Slow third-party APIs clogging Express server - javascript

I am creating a question answering application using Node.js + Express for my back-end. Front-end sends the question data to the back-end, which in turn makes requests to multiple third-party APIs to get the answer data.
Problem is, some of those third-party APIs take too long to respond, since they have to do some intense processing and calculations. For that reason, i have already implemented a caching system that saves answer data for each different question. Nevertheless, that first request each time might take up to 5 minutes.
Since my back-end server waits and does not respond back to the front-end until data arrives (the connections are being kept open), it can only serve 6 requests concurrently (that's what I have found). This is unacceptable in terms of performance.
What would be a workaround to this problem? Is there a way to not "clog" the server, so it can serve more than 6 users?
Is there a design pattern, in which the servers gives an initial response, and then serves the full data?
Perhaps, something that sets the request to "sleep" and opens up space for new connections?

Your server can serve many thousands of simultaneous requests if things are coded properly and it's not CPU intensive, just waiting for network responses. This is something that node.js is particularly good at.
A single browser, however, will only send a few requests at a time (it varies by browser) to the same endpoint (queuing the others until the earlier ones finish). So, my guess is that you're trying to test this from a single browser. That's not going to test what you really want to test because the browser itself is limiting the number of simultaneous requests. node.js is particularly good at having lots of request in flight at the same time. It can easily do thousands.
But, if you really have an operation that takes up to 5 minutes, that probably won't even work for an http request from a browser because the browser will probably time out an inactive connection still waiting for a result.
I can think of a couple possible solutions:
First, you could make the first http request be to just start the process and have it return immediately with an ID. Then, the client can check every 30 seconds of so after that sending the ID in an http request and your server can respond whether it has the result yet or not for that ID. This would be a client-polling solution.
Second, you could establish a webSocket or socket.io connection from client to server. Then, send a message over that socket to start the request. Then, whenever the server finishes its work, it can just send the result directly to the client over the webSocket or socket.io connection. After receiving the response, the client can either keep the webSocket/socket.io connection open for use again in the future or it can close it.

Related

If i used setInterval to request data from database, May it damage the server?

I'm trying to develop chat system in php, sql and ajax. I created function by ajax to get messages from database this function its event when window upload, so if i open 2 windows in browser to test the application, I found the messages bu when i send message it appear in just the window which send from not both of the 2 windows. To solve this problem i used setInterval function every 1 second to show messages.
Do this huge requests damage the server ??

I don't quite know what you meant with "Damage", but nothing can be really damaged by a few extra requests.
If you're wondering whether the webserver can handle the load, it really depends on how many chat sessions are going at the same time. Any decent web server should be able to handle a lot more than two requests per second. If you have thousands of chat sessions open, or you have very CPU intensive code, then you may notice issues.
A bigger issue may be your network latency. If your network takes more than a second for a round-trip communication with the server, then you may end up with multiple requests coming from the same client at the same time.

How often send Http request to server to check for updates? (Ajax)

I'm developing a small app using Ajax and http requests.
Currently i'm sending one request each second to server for checking if there are updates, and if there are, to get them and download data.
This timing profile is modified when a user interact with the app, but it's negligible.
I'm just wondering.. i could send an infinite loop of requests to the server. more the requests are often, more the app will be speedy. but then doesn't server get too many requests?
but how much is the right time from a request to another one?
I've read something about tokens, but i can't understand which is the better way to check if servers have some updates. thanks in advance

Long polling is one of the main options here. You should look into the server and see that there is good support for persistent HTTP connections if you expect to have a large number of users persistently connected.
For example, Apache webserver on its own opens a thread per connection, that can be a notable challenge with regard to many users with persistent connections, you end up with lots of threads (there are approaches to address it in Apache which you can research further if need be). Jetty, a java based web server (among my personal favorites), uses a more advanced network library that scales connections to threads much more logarithmically and efficiently, essentially putting connections to sleep until traffic in/out is detected.
Here are a few link:
http://en.wikipedia.org/wiki/Push_technology
http://en.wikipedia.org/wiki/Comet_(programming)
http://www.einfobuzz.com/2011/09/reverse-ajax-comet-server-push.html

Is it better to compose multiple AJAX calls parallel or serial?

I'm developing a single-page application, which sends multiple AJAX request to the server.
The system works with polling, because some data-request can take about 10-20minutes to calculate.
client asks server for data
server hands out a job-id
client asks server every few seconds for the result
The polling algorithm lowers the polling frequency over time, stopping at intervals of 10seconds.
But when a client sends different data requests in a short time, he ends up with about 10-20 job-ids and starts polling for all of them.
Is it better to simply do it this way and let the browser handle those requests in parallel or should I schedule every request and serialize them all?
Would it bring performance benefits to serialize them?

If each initial request returns a unique id and each page has a unique user id then you can poll on what information for each request.
In the JSON I would return the results for any completed request, and the current status of those that haven't completed, such as whether it has started being processed, and perhaps a percentage of completion, or how many requests are ahead of that request.
This will simplify the work as you won't be making several polling calls, but just one, getting back a complex result to give feedback to the user the status of each request.
I find it useful to give some information on status for long-running queries otherwise the user may think the request was lost.

Some months ago, I faced performance issues due to multiple ajax calls, but I haven't investigated deeper this topic since then : High latencies loading stores in an ExtJS 4.1 MVC application.

Why make the server push data?

Why make the server push data to get notifications, like using SingleR while it can be made client side?
Using a javascript timing event, that checks for recent updates at specified time intervals user can get notifications as long as he remains connected to the server.
So my question is why do more work at the server that the client can already do?

It's not more work to the server, it's less work. Suppose that you have 10000 clients (and the number could easily be in the 100K or even millions for popular web-sites) polling the server every X seconds to find out if there's new data available for them. The server would have to handle 10000 requests every X seconds even if there's no new data to return to the clients. That's huge overhead.
When the server pushes updates to the clients, the server knows when an update is available and it can send it to just the clients this data is relevant to. This reduces the network traffic significantly.
In addition it makes the client code much simpler, but I think the server is the critical concern here.

First if you didn't use server push you will not get instant update for example you can't do chat application, second why bothering the client to do job that it is not designed to do it? third you will have performance issue on the client cause like #Ash said server is a lot more powerful than a client computer.

Why do we need web-sockets?

This is more of a n00b question, but I've never really known the answer.
so why do we need the websockets protocol?
and, what are the advantages over comet-style/long poll/hanging GET-style use of HTTP?

Comet and Ajax can both deliver end-user experiences that provide desktop-like functionality and low user-perceived latency, only Web Sockets lives up to the promise of providing a native means to accurately and efficiently stream events to and from the browser with negligible latency.
With polling it makes unnecessary requests and, as a result, many connections are opened and closed needlessly in low-message-rate situations.(as with polling it sends HTTP requests at regular intervals and immediately receives a response.)
Web Sockets remove the overhead and dramatically reduce complexity.

1-WebSocket is a naturally full-duplex,
bidirectional, single-socket connection. With WebSocket, your HTTP request becomes a
single request to open a WebSocket connection and reuses the same connection
from the client to the server, and the server to the client.
2-WebSocket reduces latency. For example, unlike polling,
WebSocket makes a single request. The server does not need to wait for a request from
the client. Similarly, the client can send messages to the server at any time. This single
request greatly reduces latency over polling, which sends a request at intervals, regardless
of whether messages are available.
3-WebSocket makes real-time communication much more efficient.
You can always use polling (and sometimes even streaming) over HTTP to receive
notifications over HTTP. However, WebSocket saves bandwidth, CPU power, and latency.
WebSocket is an innovation in performance.
4-WebSocket is an underlying network protocol that enables you to build other standard
protocols on top of it.
5-WebSocket is part of an effort to provide advanced capabilities to HTML5 applications in
order to compete with other platforms.
6-WebSocket is about Simplicity

Here's an article about benefits of websocket over polling at websocket.org

It isn't clear that we do need them. In the scenario of pushing events to the client, a page can make ordinary AJAX GET requests in a loop, and the server can "hang" until events are available. After some timeout passes the server can return a "no events" response so the client will reconnect. During the period where the connection is open and the client is waiting for a response, there is an effective push channel from the server back to the client.
The timeout period can be adjusted to reduce the amount of unnecessary reconnecting, though it cannot typically be infinite because most server frameworks will terminate the server-side process if it appears to have hung for too long.
Given this existing capability, the question is: does a new communication framework really add significant value over what can already be done? It wouldn't really enable something that cannot be done. It would only slightly improve it.

We Keep Coding

JavaScript is the programming language of the Web.