socket.io | are asynchronous responses stacked up in order?

Client emits newdataFromClient to server
server starts processing new data:
socket.on('newdataFromClient', async (newdataFromClient) => {
  let result = await doSomething(newdataFromClient)
  socket.emit('response', result)
})
While the server is still processing, the client sends more data.
Server will eventually emit 2 results and send them back to the client. One for each newdataFromClient it received.
Will the results be sent back in order or is whichever one finishes faster the one that will be sent back first ?
I'm running a basic node server on a Macbook Pro. If the server starts getting multiple newdataFromClient one after another, will it start handling each on separate threads and when it runs out of threads it will start stacking them up in order?
I assume that my server will crash anyway if it can't handle too many calls but that's a separate issue.
Here I'm only interested in the order of the server responses.

socket.io events will arrive in the order they were sent from the server. The underlying transport here is TCP which keeps packets in order.
Now, if the server runs asynchronous operations while handling each arriving request, there is no guarantee about the order in which it will send the responses. In fact, the completion order on the server is likely to be unpredictable.
Will the results be sent back in order or is whichever one finishes faster the one that will be sent back first ?
Whichever one finishes first on the server and sends a message back is the one that will be received first in the client. These are independent messages that simply go on the transport en-route to the client whenever they are emitted at the server. And the TCP transport which underlies the webSocket protocol which underlies the socket.io engine will keep these packets in the order they were originally sent from the server.
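If you need the responses to leave the server in the same order the requests arrived, one option is to chain the emits on a promise. The following is only a minimal sketch under assumptions: makeOrderedHandler is a hypothetical helper, doSomething stands in for the question's async work, and emit stands in for socket.emit('response', ...). Timers replace the real work so it runs without socket.io.

```javascript
// Sketch: start every request immediately, but emit the results strictly in
// arrival order. `doSomething` and `emit` are stand-ins supplied by the caller.
function makeOrderedHandler(doSomething, emit) {
  let tail = Promise.resolve(); // settles once every earlier response was sent

  return function onNewData(data) {
    const work = doSomething(data);  // processing starts right away
    work.catch(() => {});            // mark as handled; the error still reaches `sent`
    const sent = tail
      .then(() => work)              // wait for our turn before emitting
      .then((result) => emit(result));
    tail = sent.catch(() => {});     // a failed request must not block later ones
    return sent;
  };
}

// Demo with fake work: the first request is slow, the second one is fast.
const delay = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value));
const sentInOrder = [];
const handler = makeOrderedHandler(
  (d) => delay(d.ms, d.name),
  (result) => sentInOrder.push(result)
);
const demoDone = Promise.all([
  handler({ name: 'first', ms: 30 }),
  handler({ name: 'second', ms: 5 }),
]);
```

With socket.io you would create one such handler per connected socket and pass it to socket.on('newdataFromClient', ...). The trade-off is that a slow request delays the responses queued behind it.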
Ack feature in socket.io
socket.io does have a means of getting a specific "ack" back from a specific message. If you look at the socket.emit() doc, you will see that one of the optional arguments is an ack callback. That can be used (in conjunction with the server sending an ack response) to get a specific response to this specific message. For the details on how to implement, see both the client and server-side doc for that "ack" feature.
Absent using that built-in feature, you would have to build your own messageID based system so you can match a given response coming back from the server to the original request (that's what the "ack" feature does internally) because socket.io is not natively a request/response protocol (it's a message protocol).
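To illustrate the messageID idea, here is a small sketch of a hand-rolled matcher. PendingRequests and its methods are hypothetical names, not part of socket.io, and the demo settles the requests by hand instead of going over a real connection.

```javascript
// Sketch of a request/response matcher keyed by message id (names hypothetical).
class PendingRequests {
  constructor() {
    this.nextId = 1;
    this.pending = new Map(); // messageId -> resolve function
  }

  // Call before emitting a request; send `id` along with the payload.
  create() {
    const id = this.nextId++;
    const promise = new Promise((resolve) => this.pending.set(id, resolve));
    return { id, promise };
  }

  // Call when a response carrying `id` arrives. Returns false for unknown ids.
  settle(id, result) {
    const resolve = this.pending.get(id);
    if (!resolve) return false;
    this.pending.delete(id);
    resolve(result);
    return true;
  }
}

const requests = new PendingRequests();
const a = requests.create();
const b = requests.create();
// Responses arrive out of order, yet each one lands on the right promise.
requests.settle(b.id, 'result for b');
requests.settle(a.id, 'result for a');
```

The built-in ack feature saves you this bookkeeping: the client passes a callback as the last argument to socket.emit(), and the server handler receives that callback as its last argument and calls it with the result.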

Related

What is the best practice for building an API that takes a long time to respond?

I am building an API endpoint that has to call multiple external services/DBs and I do not want my users to have to wait for this process to take place, however, the result of this process is essential for my users.
My first thought is to add the request to a queue and return immediately, then at some later time, the user can query a different endpoint for the result.
Is there a better way to go about this? Should there be a webhook response instead of asking users to query the API twice?

Three main ways I've seen:
Client sends the API request and immediately gets back a job number. The client can then send a different API request with that job number every so often (every minute or so, depending upon how long the usual result takes to get) to check on the progress. On one of those checks the job will be done and the client gets the data.
Client makes a webSocket or socket.io connection. Client sends a request over that websocket/socket.io connection. Server starts working on the result. When the result is done, it is immediately sent over the webSocket/socket.io connection back to the client. The client can then keep the websocket/socket.io connection connected for other requests or close the connection.
Use Server-Sent Events. Send the query and then, when the result is done, the server can send it back on that same connection.
I don't think there's a "best practice" among these three, as each has some advantages and some other uses which may be relevant. The polling option #1 is the lowest common denominator and will work in any situation, but it requires a polling strategy by the client and may add some latency (the result can be ready before the client polls).
The choices #2 and #3 are both very efficient and their general technology may have other uses also.
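As a rough illustration of option #1, the job bookkeeping could look like the sketch below. It assumes an in-memory Map (a real service would persist jobs), and submitJob and getJob are hypothetical names; the HTTP routes that would call them are left out.

```javascript
const { randomUUID } = require('crypto');

// In-memory job store (assumption: jobs may be lost on restart).
const jobs = new Map();

// Step 1: accept the request and hand back a job id immediately.
function submitJob(work) {
  const id = randomUUID();
  jobs.set(id, { status: 'pending' });
  Promise.resolve()
    .then(work) // run the slow work in the background
    .then((result) => jobs.set(id, { status: 'done', result }))
    .catch((err) => jobs.set(id, { status: 'failed', error: String(err) }));
  return id;
}

// Step 2: the client polls with the job id until the status is no longer pending.
function getJob(id) {
  return jobs.get(id) || { status: 'unknown' };
}

// Demo: a fake job that takes 10 ms.
const jobId = submitJob(() => new Promise((resolve) => setTimeout(resolve, 10, 42)));
```

The first endpoint would return jobId right away, and the second would return getJob(jobId) on each poll.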

Some of Server Sent events lost while EventSource reconnecting

In our application we have an SSE connection with a lifetime of 5 minutes. After 5 minutes the server closes the connection and the client reconnects automatically.
But here is the problem: while the client is reconnecting, some event might happen on the backend, and it will not be passed to the SSE connection because the connection is not established yet.
So there are time slots of 1-2 seconds in which we may lose events.
How can we handle this case? What is your opinion?
As I see it, we have only one choice: after every SSE reconnect, make additional GET requests to the server to refresh the data.

This is exactly what the Last-Event-ID HTTP header in the SSE protocol is designed for.
On the server side you should look for that header when you get a new connection. If it is set, immediately stream the data the client missed. And you should set the id field of each message you push out to some unique identifier.
On the client side, for your particular use case, you do not need to do anything: when SSE reconnect runs it sends that header automatically, using the id of the last data it had received.
In chapter 5 of my book Data Push Apps with HTML5 SSE, I argue you should also include that same unique id, explicitly, in the JSON data packet you push out, and you should support the Last-Event-ID being given as a POST/GET argument as well. This gives you the flexibility to work with the long-polling alternative approaches to SSE, and also means it can work if the reconnect came from the client-side rather than the server-side. (The former would be for supporting older browsers, though that matters less and less as IE dies out; the latter would be needed if you implement your own keep-alive mechanism.)
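A minimal sketch of the server-side bookkeeping, assuming numeric, strictly increasing ids and a bounded in-memory history. EventLog and formatSse are hypothetical names, and the HTTP handler itself is omitted.

```javascript
// Keep a bounded history so reconnecting clients can be caught up.
class EventLog {
  constructor(maxSize = 1000) {
    this.maxSize = maxSize;
    this.events = []; // [{ id, data }], ids strictly increasing
    this.nextId = 1;
  }

  append(data) {
    const event = { id: this.nextId++, data };
    this.events.push(event);
    if (this.events.length > this.maxSize) this.events.shift(); // drop the oldest
    return event;
  }

  // Everything the client missed, given its Last-Event-ID header value.
  since(lastEventId) {
    const last = Number(lastEventId) || 0; // missing header -> send everything kept
    return this.events.filter((e) => e.id > last);
  }
}

// Serialize one event in the SSE wire format, including the id field.
function formatSse(event) {
  return `id: ${event.id}\ndata: ${JSON.stringify(event.data)}\n\n`;
}

const log = new EventLog();
log.append({ price: 1 });
log.append({ price: 2 });
log.append({ price: 3 });
const missed = log.since('1'); // header values arrive as strings
```

In a Node HTTP handler the header would be read as req.headers['last-event-id'], and each entry of log.since(...) would be written to the response with formatSse before live events resume.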

You can queue events on the server and dequeue them when the client is active.
Regardless of the client's connection status, just add all the events to the queue.
When the client is connected, dequeue all the events from the queue.
Instead of sending the message directly to clients the application
sends it to the broker. Then the broker sends the message to all
subscribers (which may include the original sender) and they send it
to the clients.
Refer to https://www.tpeczek.com/2017/09/server-sent-events-or-websockets.html

How Websockets are implemented?

How Websockets are implemented?
What is the algorithm behind this new tech (in comparison to Long-Polling)?
How can they be better than Long-Polling in term of performance?
I am asking these questions because here we have a sample code of Jetty websocket implementation (server-side).
If we wait long enough, a timeout will occur, resulting in the
following message on the client.
And that is definitely the problem I'm facing when using long-polling. It stops the process to prevent server overload, doesn't it?

How Websockets are implemented?
webSockets are implemented as follows:
Client makes HTTP request to server with "upgrade" header on the request
If server agrees to the upgrade, then client and server exchange some security credentials and the protocol on the existing TCP socket is switched from HTTP to webSocket.
There is now a lasting open TCP socket connecting client and server.
Either side can send data on this open socket at any time.
All data must be sent in a very specific webSocket packet format.
Because the socket is kept open as long as both sides agree, this gives the server a channel to "push" information to the client whenever there is something new to send. This is generally much more efficient than using client-driven Ajax calls where the client has to regularly poll for new information. And, if the client needs to send lots of messages to the server (perhaps something like a multi-player game), then using an already open socket to send a quick message to the server is also more efficient than an Ajax call.
Because of the way webSockets are initiated (starting with an HTTP request and then repurposing that socket), they are 100% compatible with existing web infrastructure and can even run on the same port as your existing web requests (e.g. port 80 or 443). This makes cross-origin security simpler and keeps anyone on either client or server side infrastructure from having to modify any infrastructure to support webSocket connections.
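To make the upgrade step a bit more concrete, here is a sketch of the server's side of the handshake as described in RFC 6455: the client sends a Sec-WebSocket-Key header, and the server answers with a Sec-WebSocket-Accept value derived from it. The function names are hypothetical, and a real server would also validate the other request headers.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for computing the accept value.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Sec-WebSocket-Accept = base64(sha1(Sec-WebSocket-Key + GUID))
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// The response that switches the existing TCP socket from HTTP to webSocket.
function buildUpgradeResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${computeAcceptKey(secWebSocketKey)}`,
    '',
    '',
  ].join('\r\n');
}

// Sample key taken from the RFC itself.
const accept = computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ==');
```

After this response is written, both sides stop speaking HTTP on that socket and exchange webSocket frames instead.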
What is the algorithm behind this new tech (in comparison to
Long-Polling)?
There's a very good summary of how the webSocket connection algorithm and webSocket data format works here in this article: Writing WebSocket Servers.
How can they be better than Long-Polling in term of performance?
By its very nature, long-polling is a bit of a hack. It was invented because there was no better alternative for server-initiated data sent to the client. Here are the steps:
The client makes an http request for new data from the server.
If the server has some new data, it returns that data immediately and then the client makes another http request asking for more data. If the server doesn't have new data, then it just hangs onto the connection for awhile without providing a response, leaving the request pending (the socket is open, the client is waiting for a response).
If, at any time while the request is still pending, the server gets some data, then it forms that data into a response and returns a response for the pending request.
If no data comes in for awhile, then eventually the request will timeout. At that point, the client will realize that no new data was returned and it will start a new request.
Rinse, lather, repeat. Each piece of data returned or each timeout of a pending request is then followed by another ajax request from the client.
So, while a webSocket uses one long-lived socket over which either client or server can send data to the other, the long-polling consists of the client asking the server "do you have any more data for me?" over and over and over, each with a new http request.
Long polling works when done right, it's just not as efficient on the server infrastructure, bandwidth usage, mobile battery life, etc...
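The client side of those long-polling steps can be sketched as a simple loop. This is an illustration only: pollOnce is a hypothetical function that performs one HTTP request and resolves with new data, or with null when the request timed out without data. The demo uses a scripted fake instead of a real server, and maxRequests exists only so the demo terminates.

```javascript
// Sketch of a long-polling client loop (all names hypothetical).
async function longPoll(pollOnce, onData, maxRequests) {
  let requests = 0;
  while (requests < maxRequests) {
    requests += 1;
    try {
      const data = await pollOnce();     // the request stays pending on the server
      if (data !== null) onData(data);   // data arrived: handle it, then ask again
      // null means the request timed out: just issue a new one
    } catch (err) {
      // network error: back off briefly before the next request
      await new Promise((resolve) => setTimeout(resolve, 10));
    }
  }
  return requests;
}

// Demo: the fake server answers with data, then a timeout, then data again.
const scripted = ['a', null, 'b'];
const received = [];
const pollDone = longPoll(async () => scripted.shift(), (d) => received.push(d), 3);
```

Note that every iteration is a full HTTP request, which is where the extra overhead compared to a single long-lived webSocket comes from.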
What I want is explanation about this: the fact Websockets keep an
open connection between C/S isn't quite the same to Long Polling wait
process? In other words, why Websockets don't overload the server?
Maintaining an open webSocket connection between client and server is a very inexpensive thing for the server to do (it's just a TCP socket). An inactive, but open TCP socket takes no server CPU and only a very small amount of memory to keep track of the socket. Properly configured servers can hold hundreds of thousands of open sockets at a time.
On the other hand a client doing long-polling, even one for which there is no new information to be sent to it, will have to regularly re-establish its connection. Each time it re-establishes a new connection, there's a TCP socket teardown and new connection and then an incoming HTTP request to handle.
Here are some useful references on the topic of scaling:
600k concurrent websocket connections on AWS using Node.js
Node.js w/1M concurrent connections!
HTML5 WebSocket: A Quantum Leap in Scalability for the Web
Do HTML WebSockets maintain an open connection for each client? Does this scale?

Very good explanation about web sockets, long polling and other approaches:
In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?
Long poll - request → wait → response. Creates a connection to the server like AJAX does, but keeps the connection open for some time (not long though); while the connection is open, the client can receive data from the server. The client has to reconnect periodically after the connection is closed due to timeouts or end of data. On the server side it is still treated like an HTTP request, same as AJAX, except that the answer will be sent now or at some time in the future defined by application logic. Supported in all major browsers.
WebSockets - client ↔ server. Creates a TCP connection to the server and keeps it open as long as needed. Server or client can easily close it. The client goes through an HTTP-compatible handshake; if it succeeds, server and client can exchange data in both directions at any time. It is very efficient if the application requires frequent data exchange in both directions. WebSocket data framing includes masking for each message sent from client to server, so the data is obfuscated on the wire; note that masking is not encryption (use wss:// for that). support chart (very good)
Overall, sockets have much better performance than long polling and you should use them instead of long polling.
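The masking mentioned above is a plain XOR of the payload with a 4-byte key that travels in the same frame (RFC 6455, section 5.3). A small sketch, with applyMask as a hypothetical helper name and the key and payload taken from the RFC's own example:

```javascript
// XOR each payload byte with the 4-byte masking key. Applying the same
// operation twice restores the original bytes.
function applyMask(payload, maskingKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}

const key = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const masked = applyMask(Buffer.from('Hello'), key); // what the client puts on the wire
const unmasked = applyMask(masked, key);             // what the server recovers
```

Since the key is sent along with the data, anyone who can read the frame can undo the masking, which is why it should not be relied on for confidentiality.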

Websocket Communication Latency Questions

3 mini questions regarding websocket connections
When the client sends data to the server there is latency. When the server sends data to the client is there latency or it is instant?
If the client sends data to the server VERY FAST in a specific row - let's say [1, 2, 3], is there any chance that, due to latency or other reasons, the data to be received by the server in a different row? ( like [2, 1, 3] )
(Same as question #2, but when the server sends the data)

Yes, there is latency. It's still a connection and there is still a chain to navigate. Latency only matters when things are changing, and given that it takes X ms for the message to reach the client and another X ms for the client to do anything about it, it's quite possible the state will change during those ms. In the same way that HTTP requests (WebSockets are just about the same thing) become 'hot', I believe the latency will diminish (all other things being equal) but it will still exist.
No, WebSockets run over TCP, so messages will arrive in order. UDP transport is fire-and-forget: it doesn't send any notification of receipt and it doesn't resend or reorder packets, so you can send messages faster but can make no assumptions regarding receipt or the order of events. Page impressions would be a great example of where you don't really care about the order and probably don't care too much about when the server receives such a message. WebRTC may bring UDP connections between JS and server, but the standard is still emerging. For now, WebSockets connect via an HTTP upgrade, meaning they are TCP, where ordering information and receipt confirmation are a thing (at the cost of larger and more messages being sent to and fro).
Same answer! It all happens over TCP so the whole trip is a round-trip, but order is assured.

How activemq works

I read on the ActiveMQ official page how it works, but could not understand the whole scenario how request and response is going on.
As per my understanding now, if I have a servlet on the server and JavaScript as the client using amq.js, then
JavaScript sends a poll request to server.
Server starts a thread and checks for data to be sent as response.
If data is not available at that time, server waits till there is any data.
Server sends the data when available and then the connection breaks.
Client receives the data and again send the poll request.
In this way the client request is parked at the server till the data is received.
Is this understanding correct and possible?
If yes, how the request is parked at the server?
Thanks.

Yes, you understand it correctly, with the restriction that the request is held for 30 seconds (the default) and then times out.
The request is parked at the server using Jetty Continuations, as Jetty is the servlet container in ActiveMQ.
Since ActiveMQ, on the Java side, can be set up with asynchronous listeners, there does not need to be a thread blocked for the entire poll.
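The idea of parking a request without blocking a thread can be illustrated in JavaScript. To be clear, this is not Jetty Continuations or ActiveMQ code, only an analogous sketch: ParkedPolls is a hypothetical class, and each parked poll is just a stored callback plus a timer.

```javascript
// Sketch: park pending polls until data arrives or the timeout fires.
class ParkedPolls {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.waiting = new Set(); // one entry per parked request
  }

  // Park a poll: resolves with data when published, or with null on timeout.
  park() {
    return new Promise((resolve) => {
      const entry = { resolve };
      entry.timer = setTimeout(() => {
        this.waiting.delete(entry);
        resolve(null); // timed out; the client is expected to poll again
      }, this.timeoutMs);
      this.waiting.add(entry);
    });
  }

  // Data became available: answer every parked poll and clear the list.
  publish(data) {
    for (const entry of this.waiting) {
      clearTimeout(entry.timer);
      entry.resolve(data);
    }
    this.waiting.clear();
  }
}

// Demo with a short timeout: one poll gets data, the next one times out.
const polls = new ParkedPolls(20);
const answered = polls.park();
polls.publish('message');
const timedOut = polls.park();
```

While a poll is parked nothing is running on its behalf; work only happens when a message is published or the timer fires.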
