How are real-time notifications implemented in twitter?

How are real-time notifications implemented in twitter? - javascript

How twitter page queries/receives notifications, information about new tweets?
I'd like to implement something like this mechanism for my html+js client->webservice

I don't know what Twitter uses exactly but there are few techniques to handle server notifications.
You can use long-polling (your client issues the same ajax request every few seconds to get new information):
http://techoctave.com/c7/posts/60-simple-long-polling-example-with-javascript-and-jquery/
Or there is the "new" standard called Websocket. A good start to how to write a websocket client is this mozilla tutorial.

There are multiple ways to implement real-time notifications:
HTTP Long Polling : The client initiates a request. The server checks if it has any new notifications. Irrespective of whether or not it has new notifications appropriate response is send and connection is closed. After time X client initiates another request (+ Very easy to implement - notifications are not real time. They depend on X since data retrieval is client initiated. As X decreases overhead on server increases )
HTTP Streaming: This is very similar to HTTP Long Polling however the connection is not closed. The server sends chunked response. So as soon as server receives new notification that it wants to push it can simply write to the socket. ( + lower latency than long polling and almost real time behaviour / overhead of closing connection and re opening reduced - memory usage client side keeps on piling up / ugly hacks etc )
WebSocket: TCP based protocol provides true two way communication. The server can push data to client any time. ( + ve: true real time - some older browsers dont support it ). Read more about it WebSocket.org | About WebSocket

Related

Client access vs broadcast data from web server

I'm looking for technique or skils to fix the ways for new web site.
This site show the read time data which located on server as file or data on memory.
I'll use Node.js for server-side. But I can't fix how to get the data and show that to web site user.
Because this data have to update per 1 second at least.
I think it is similar to the stock price page.
I know there are a lot of ways to access data like AJAX, Angular.js, Socket.io..
Also each has pros and cons.
Which platform or framework is good in this situation?

This ultimately depends on how much control you have over the server side. For data that needs to be refreshed every second, doing the polling on client side would place quite the load on the browser.
For instance, you could do it by simply using one of the many available frameworks to make http requests inside some form of interval. The downsides to this approach include:
the interval needs to be run in the background all the time while the user is on the page
the http request needs to be made for every interval to check if the data has changed
comparison of data also needs to be performed by the browser, which can be quite heavy at 1 sec intervals
If you have some server control, it would be advisable to poll the data source on the server, i.e. using a proxying microservice, and use the server to perform change checking and only send data to clients when it has changed.
You could use Websockets to communicate those changes via a "push" style message instead of making the client browser do the heavy lifting. The flow would go something like:
server starts polling when a new client starts listening on its socket
server makes http requests for each polling interval, runs comparison for each result
when result has changed, server broadcasts a socket message to all connected clients with new data
The main advantage to this is that all the client needs to do is "connect and listen". This even works with data sources you don't control – the server you provide can perform any data manipulation needed before it sends a message to the client, the source just needs to provide data when requested.
EDIT: just published a small library that accomplishes this goal: Mighty Polling ⚡️ Socket Server. Still young, examine for your use if using.

Is it good - use setInterval for data refreshing?

I have an issue - I should update information for user as soon as possible, but i don't know exact time when it'll happen.
I use setInterval function that checks differences between current state and the state before checking. If there are any differences then I send an AJAX request and update info. Is it bad? I can't (or don't know how to) listen any events in that case.
And what about interval time? All users (~300 at the same time) are from local network (ping 15-20 ms). I have to refresh information immediately. Should I better use 50ms or 500ms?
If the question is not very clear just ask - I'll try to say it in other words.
Thanks in advance

Solution: Websocket
Websockets allow client applications to respond to messages initiated from the server (compare this with HTTP where the client needs to first ask the server for data via a request). A good solution would be to utilize a websocket library or framework. On the client you'll need to create a websocket connection with the server, and on the server you'll need to alert any open websockets whenever an update occurs.
The issue with interval
It doesn't scale, you could set the interval to 4000 miliseconds and still once you hit 1000 users...you are going to be slamming your server with 10000 requests and responses a minute...This will use tons of data and use processing to return nothing. Websockets will only send data to the client agent only when the event you want to send actually occurs.
Backend: PHP
Frameworks
Ratchet
Ratchet SourceCode
phpwebsocket
PHP-Websockets-Server
Simply implement one of the above frameworks as a websocket connection then you will register as a client to this endpoint and it will send data on whatever event you define.

How Websockets are implemented?

How Websockets are implemented?
What is the algorithm behind this new tech (in comparison to Long-Polling)?
How can they be better than Long-Polling in term of performance?
I am asking these questions because here we have a sample code of Jetty websocket implementation (server-side).
If we wait long enough, a timeout will occur, resulting in the
following message on the client.
And that is definately the problem I'm facing when using Long-polling. It stops the process to prevent server overload, doesn't it ?

How Websockets are implemented?
webSockets are implemented as follows:
Client makes HTTP request to server with "upgrade" header on the request
If server agrees to the upgrade, then client and server exchange some security credentials and the protocol on the existing TCP socket is switched from HTTP to webSocket.
There is now a lasting open TCP socket connecting client and server.
Either side can send data on this open socket at any time.
All data must be sent in a very specific webSocket packet format.
Because the socket is kept open as long as both sides agree, this gives the server a channel to "push" information to the client whenever there is something new to send. This is generally much more efficient than using client-driven Ajax calls where the client has to regularly poll for new information. And, if the client needs to send lots of messages to the server (perhaps something like a mnulti-player game), then using an already open socket to send a quick message to the server is also more efficient than an Ajax call.
Because of the way webSockets are initiated (starting with an HTTP request and then repurposing that socket), they are 100% compatible with existing web infrastructure and can even run on the same port as your existing web requests (e.g. port 80 or 443). This makes cross-origin security simpler and keeps anyone on either client or server side infrastructure from having to modify any infrastructure to support webSocket connections.
What is the algorithm behind this new tech (in comparison to
Long-Polling)?
There's a very good summary of how the webSocket connection algorithm and webSocket data format works here in this article: Writing WebSocket Servers.
How can they be better than Long-Polling in term of performance?
By its very nature, long-polling is a bit of a hack. It was invented because there was no better alternative for server-initiated data sent to the client. Here are the steps:
The client makes an http request for new data from the client.
If the server has some new data, it returns that data immediately and then the client makes another http request asking for more data. If the server doesn't have new data, then it just hangs onto the connection for awhile without providing a response, leaving the request pending (the socket is open, the client is waiting for a response).
If, at any time while the request is still pending, the server gets some data, then it forms that data into a response and returns a response for the pending request.
If no data comes in for awhile, then eventually the request will timeout. At that point, the client will realize that no new data was returned and it will start a new request.
Rinse, lather, repeat. Each piece of data returned or each timeout of a pending request is then followed by another ajax request from the client.
So, while a webSocket uses one long-lived socket over which either client or server can send data to the other, the long-polling consists of the client asking the server "do you have any more data for me?" over and over and over, each with a new http request.
Long polling works when done right, it's just not as efficient on the server infrastructure, bandwidth usage, mobile battery life, etc...
What I want is explanation about this: the fact Websockets keep an
open connection between C/S isn't quite the same to Long Polling wait
process? In other words, why Websockets don't overload the server?
Maintaining an open webSocket connection between client and server is a very inexpensive thing for the server to do (it's just a TCP socket). An inactive, but open TCP socket takes no server CPU and only a very small amount of memory to keep track of the socket. Properly configured servers can hold hundreds of thousands of open sockets at a time.
On the other hand a client doing long-polling, even one for which there is no new information to be sent to it, will have to regularly re-establish its connection. Each time it re-establishes a new connection, there's a TCP socket teardown and new connection and then an incoming HTTP request to handle.
Here are some useful references on the topic of scaling:
600k concurrent websocket connections on AWS using Node.js
Node.js w/1M concurrent connections!
HTML5 WebSocket: A Quantum Leap in Scalability for the Web
Do HTML WebSockets maintain an open connection for each client? Does this scale?

Very good explanation about web sockets, long polling and other approaches:
In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?
Long poll - request → wait → response. Creates connection to server like AJAX does, but keep-alive connection open for some time (not long though), during connection open client can receive data from server. Client have to reconnect periodically after connection is closed due to timeouts or data eof. On server side it is still treated like HTTP request same as AJAX, except the answer on request will happen now or some time in the future defined by application logic. Supported in all major browsers.
WebSockets - client ↔ server. Create TCP connection to server, and keep it as long as needed. Server or client can easily close it. Client goes through HTTP compatible handshake process, if it succeeds, then server and client can exchange data both directions at any time. It is very efficient if application requires frequent data exchange in both ways. WebSockets do have data framing that includes masking for each message sent from client to server so data is simply encrypted. support chart (very good)
Overall, sockets have much better performance than long polling and you should use them instead of long polling.

Get Refesh Data From The Server

How can i get latest or fresh data from server (if in server happened new event (for example there are 2 users x,y and x send messages to y and y get this message without refreshing page ) )?
I don't want to use setInterval because it repeats all message again and again. Is there any Technique that can i use for this ?
I heard about Ajax that technique need to send request to the server but i want when happen an event in the server and webpage get it without refreshing..

The first technique is the long polling, which sends request to the server and waits until the server sends something, for example the new message. You must re-send requests to the server each time you get a new message or your request is time out. This technique uses AJAX.
Long polling PHP example - How do I implement basic "Long Polling"?
The second is web sockets, https://en.wikipedia.org/wiki/WebSocket
this stackoverflow question deals with the implementation of websocket.
socket.io has a demo of chat application.

If you looking for bidirectional full duplex method then go for WebSockets but for just polling data from server you can use Server Sent Event as well. Adding reference links for both:
WebSocket:
http://html5demos.com/web-socket
http://en.wikipedia.org/wiki/WebSocket
HTML5 Websockets for Realtime Chat app?
SSE:
http://www.html5rocks.com/en/tutorials/eventsource/basics/
Examples:
SSE: http://demo.howopensource.com/sse/

Why make the server push data?

Why make the server push data to get notifications, like using SingleR while it can be made client side?
Using a javascript timing event, that checks for recent updates at specified time intervals user can get notifications as long as he remains connected to the server.
So my question is why do more work at the server that the client can already do?

It's not more work to the server, it's less work. Suppose that you have 10000 clients (and the number could easily be in the 100K or even millions for popular web-sites) polling the server every X seconds to find out if there's new data available for them. The server would have to handle 10000 requests every X seconds even if there's no new data to return to the clients. That's huge overhead.
When the server pushes updates to the clients, the server knows when an update is available and it can send it to just the clients this data is relevant to. This reduces the network traffic significantly.
In addition it makes the client code much simpler, but I think the server is the critical concern here.

First if you didn't use server push you will not get instant update for example you can't do chat application, second why bothering the client to do job that it is not designed to do it? third you will have performance issue on the client cause like #Ash said server is a lot more powerful than a client computer.

We Keep Coding

JavaScript is the programming language of the Web.