How does Twitter send websockets to it's clients - javascript

There are millions of tweets and millions of active users in the Twitter. When a tweet gets like or retweet,how do they send live updates(websockets) of every tweet to its clients?
I think they wouldn't send live updates(websockets) of each tweet to every active user, that would result in (no of active tweets)X(no of active users)=(millions)X(millions)>10^12 live updates in each minute, each user would get millions of updates(of all the tweets) in each minute.
I think the live update of a particular tweet would only be received by the users who are watching that particular tweet.If this assumption is correct,then please tell me, how do they filter clients who are watching a particular tweet and send live updates of that tweet only to those filtered clients?
I was just watching a tweet in the Twitter, I was surprised to see live updates in likes and retweets of that tweet.I haven't seen any social media(like Instagram) giving live updates for every single post of it. I want to implement this method in my social media website.What I had concluded might or might not be correct, but I would request you to explain me, how does Twitter send live updates of every single tweet only to those particular users who are watching it.

To be clear, ONE device has ONE socket connection, to Twitter's cloud.
That ONE socket connection, receives ALL information from Twitter's cloud
new tweets
new likes
new retweets
everything else
all information comes on the ONE socket.
The cloud "figures out" what to send to who.
Is this what you were asking? Hope it clears it up.
The amazing thing is that twitter's cloud can connect to perhaps 100 ? million devices at the same time. (This is an amazing, major engineering achievement which requires an incredible amount of hardware, money and engineers.)
BTW if you're trying to implement something like this for an experiment or client. These days it is inconceivable you'd try to write the server side to achiever this, from scratch. Services exist, which do exactly this - example pusher.com, pubnub.com and so on.
(Indeed, these realtime infrastructure services, are, the basic technology of our era - everything runs on them.)
Here's a glance at the mind-boggling effort involved in Twitter's cloud: https://blog.twitter.com/engineering/en_us/topics/infrastructure/2017/the-infrastructure-behind-twitter-scale.html

Realtime communication or what you refer to as 'live updates' is all a play of various low-level networking protocols. Here's a bit of background on the protocols in general just so you know what you are working with:
A regular REST API uses the HTTP as the underlying protocol for communication, which follows the request and response paradigm, meaning the communication involves the client requesting some data or resource from a server, and the server responding back to that client. This is what you usually see in a regular website that isn't really live but shows or does something following a button click or similar trigger from the user.
However, HTTP is a stateless protocol, so every request-response cycle will end up having to repeat the header and metadata information. This incurs additional latency in case of frequently repeated request-response cycles.
With WebSockets, although the communication still starts off as an initial HTTP handshake, it is further upgrades to follow the WebSockets protocol (i.e. if both the server and the client are compliant with the protocol as not all entities support the WebSockets protocol).
Now with WebSockets, it is possible to establish a full-duplex and persistent connection between the client and a server. This means that unlike a request and a response, the connection stays open for as long as the application is running (i.e. it’s persistent), and since it is full-duplex, two-way simultaneous communication is possible. Now the server is capable of initiating communication and 'push' some data to the client when new data (that the client is interested in) becomes available.
The WebSockets protocol is stateful and allows you to implement the Publish-Subscribe (or Pub/Sub) messaging pattern which is the primary concept used in the real-time technologies where you are able to get new updates in the form of server push without the client having to request (refresh the page) repeatedly. Examples of such applications other than Twitter are Uber-like vehicle location tracking, Push Notifications, Stock market prices updating in real-time, chat, multiplayer games, live online collaboration tools, etc.
You can check out a deep dive article on WebSockets which explains the history of this protocol, how it came into being, what it’s used for and how you can implement it yourself.
Another interesting one is SSE or Server-Sent Events which is a subscribe-only version of WebSockets and restricted to the web platform. You can use SSE to receive real-time push updates from servers, but this would be unidirectional as you can only receive updates via SSE and not really publish anything. Here's a video where I explain this in much more detail: https://www.youtube.com/watch?v=Z4ni7GsiIbs
You can implement these various protocols as required from scratch or use a distributed messaging service like Ably which not only provides the messaging infrastructure of these protocols but also offers other add-ons such as scalability, reliability, message ordering, protocol interoperability, etc, out of the box, which is essential for a production-level app.
Full disclaimer: I'm a Dev Advocate for Ably but I hope the info in my answer is useful to you nevertheless.

Related

Is it possible and plausible to implement a WebService over a WebRTC Data Channel?

Is it possible to implement a WebService over a WebRTC Data Channel ?
The idea is:
The client makes one https request to the server for signaling and session establishment
The client and the server start to communicate via a WebRTC DataChannel bidirectionally
Benefits?:
Performance ?
Requests goes over one connection and the standard allows for multiple datachannels over the same connection ( ports )
Flexible networking topologies
UDP
End to end encryption
The server can send events over the same connection
Load balancing could be implemented from a pool of servers client side without a load balancer , or all kinds of different solutions
Currently being debated the addition of DataChannels to Workers/Service Workers/ etc https://github.com/w3c/webrtc-extensions/issues/64
Drawbacks:
Application specific code for implementing request fragmentation and control over buffer limits
[EDIT 3] I don't know how much of a difference in terms of performance and cpu/memory usage will it be against HTTP/2 Stream
Ideas:
Clients could be read replicas of the data for sync, or any other applications that are suitable for orbit-db https://github.com/orbitdb/orbit-db in the public IPFS network, the benefit of using orbit-db is that only allows to the owner to make writes, then the server could additionally sign with his key all the data so that the clients could verify and trust it's from the server, that could offload the main server for reads, just an idea.
[EDIT]
I've found this repo: https://github.com/jsmouret/grpc-over-webrtc
amazing!
[EDIT2]
Changed Orbit-db idea and removed cluster IPFS after investigating a bit
[EDIT3]
After searching Fetch PROS for HTTP/2 i've found Fetch upload streaming with ReadableStreams, i don't know how much of a difference will it be to run GRPC (bidi) over a WebRTC DataChannel or a HTTP/2 Stream
https://www.chromestatus.com/feature/5274139738767360#:~:text=Fetch%20upload%20streaming%20lets%20web,things%20involved%20with%20network%20requests).
Very cool video explaining the feature: https://www.youtube.com/watch?v=G9PpImUEeUA
Lots of different points here, will try to address them all.
The idea is 100% feasible. Check out Pion WebRTC's data-channels example. All it takes a single request/response to establish a connection.
Performance
Data channels are a much better fit if you are doing latency sensitive work.
With data channels you can measure backpressure. You can tell how much data has been delivered, and how much has has been queued. If the queue is getting full you know you are sending too much data. Other APIs in the browser don't give you this. There are some future APIs (WebTransport) but they aren't available yet.
Data channels allow unordered/unreliable delivery. With TCP everything you send will be delivered and in order, this issue is known as head-of-line blocking. That means if you lose a packet all subsequent packets must be delayed. An example would be if you sent 0 1 2 3, if packet 1 hasn't arrived yet 2 and 3 can't be processed yet. Data channels can be configured to give you packets as soon as they arrive.
I can't give you specific numbers on the CPU/Memory costs of running DTLS+SCTP vs TLS+WebSocket server. It depends on hardware/network you have, what the workload is etc...
Multiplexing
You can serve multiple DataChannel streams over a single WebRTC Connection (PeerConnection). You can also serve multiple PeerConnections over a single port.
Network Transport
WebRTC can be run over UDP or TCP
Load Balancing
This is harder (but not intractable) moving DTLS and SCTP sessions between servers isn't easy with existing libraries. With pion/dtls it has the support to export/resume a session. I don't know support in other libraries however.
TLS/Websocket is much easier to load balance.
End to end encryption
WebRTC has mandatory encryption. This is nice over HTTP 1.1 which might accidentally fall back to non-TLS if configured incorrectly.
If you want to route a message through the server (and not have the server see it) I don't think what protocol you use matters.
Topologies
WebRTC can be run in many different topologies. You can do P2P or Client/Server, and lots of things in between. Depending on what you are building you could build a hybrid mesh. You could create a graph of connections, and deploy servers as needed. This flexibility lets you do some interesting things.
Hopefully addressed all your points! Happy to discuss further in the comments/will keep editing the question.
I was also wondering about this HTTP-over-WebRTC DataChannel idea a couple of years ago. The problem at hand was how to securely connect from a web app to an IoT device (raspberry pi) that sits behind a firewall.
Since there was no readily available solution, I ended up building a prototype. It did the job and has been in live deployment since 2019.
See this technical blog post that covers the design and implementation in more detail:
https://webrtchacks.com/private-home-surveillance-with-the-webrtc-datachannel/
High level architecture:
Simplified sequence diagram:
Recently began the process of extracting the code into a standalone repo.
https://github.com/ambianic/peerfetch
If your main use-case exchanges small content, you may have a look at CoAP RFC 7252. A peer may easily implement both roles, client and server, though the exchanged messages for request and response share the same fomat.
For some advanced usage of DTLS 1.2, DTLS Connection ID can do some magic for you.
If you don't stick to javascript and java is an option, you may check the open source project Eclipse/Californium. That's a CoAP/DTLS implementation, which comes with DTLS Connection ID and some prepared advanced examples as built-in-cid-load-balancer-support or DTLS-graceful-restart.

How to make client-client connection in js without node.js?

I am trying to make simple game of Tic-Tac-Toe in JS.
I made almost everything. Now there is just one thing to do.
I would like to make it available to play online with someone.
I want to send data between two games via Internet.
Unfortunately my server does not support Node.JS.
Is there a way to make it happen without any server-side "socket".
I thought I could make it with XMLHttpRequest() for saving/loading data into/from server files and play like that, but I think it would require a lot of code and maybe for ttt it would be fast enough, but for more complicated games it would not be sufficient.
I know this is tough, but how did they do that before node.js?
For a game like Tic-Tac-Toe where players take turns, latency definitely takes second seat to every other factor in communications. For this reason alone, communicating with just the HTTP protocol, typically by utilizing the XMLHttpRequest class or the Fetch API, is a very reasonable approach which will save you a lot of programming effort.
Otherwise, when wanting one or several low-latency and/or RTC channels (for hopefully a good reason), both WebRTC and WebSocket are viable candidates.
WebRTC, for one, can absolutely do peer-to-peer, while WebSocket uses the client-server model. But even WebRTC requires a "signalling" service to exchange peer identifiers initially, before eventually switching to communicating between the peers directly. While peer identifiers are required to set up WebRTC communication, the API deliberately does not cover how peer identifiers are exchanged -- however you want to design your signalling service, is up to you. For all WebRTC cares, you can "POST" a peer ID to a HTTP server and retrieve it with the other peer's Web browser and vice-versa. WebRTC starts with already known peer IDs.
Otherwise, if configured to do so, WebRTC is able to utilize STUN and/or TURN services to maintain peer-to-peer connection, on networks that otherwise prohibit straightforward IP routing between any two clients -- a necessary prerequisite of true peer-to-peer communication.
STUN/TURN services aren't required in all cases, but knowing average network conditions, without using either STUN or TURN or both, your application wouldn't be very reliable for any two clients separated by multiple arbitrary networks. Like in scenarios where both parties are separated by at least one firewall or a stubborn router that functions as one.
A TURN service would then transparently route WebRTC communication, working as a relay.
A STUN service punches holes in the firewalls between clients in such a way that peer-to-peer communication is possible afterwards. Meaning that in contrast with a TURN service, it does not play any active part in communication after latter is established.
WebRTC is a bit complex, especially if you are expecting an API along the lines of send and receive, but a simplified connection example should be understandable to a developer.
You may also not need to use WebRTC API directly, there are libraries that encapsulate WebRTC into a simpler API of one flavour or another, API that simultaneously hides the more fringe or "boilerplate" aspects of WebRTC and which also helps minimize the risk of getting into trouble as different user agents are notorious for implementing different parts of WebRTC a bit differently.
One of these libraries is PeerJS but there are others, without a doubt.
The WebSocket API, unlike WebRTC, requires a WebSocket compliant server, and WebSocket API does not do peer-to-peer. The good news is that 1) a WebSocket compliant service is typically just an advanced relay (often fused with an application back-end logic), albeit working on the application level instead of the session level for TURN and 2) there are plenty of "turn-key" WebSocket server implementations out there.

Webrtc on fails on local network without internet connectivity [duplicate]

WebRTC signalling is driving me crazy. My use-case is quite simple: a bidirectional audio intercom between a kiosk and to a control room webapp. Both computers are on the same network. Neither has internet access, all machines have known static IPs.
Everything I read wants me to use STUN/TURN/ICE servers. The acronyms for this is endless, contributing to my migraine but if this were a standard application, I'd just open a port, tell the other client about it (I can do this via the webapp if I need to) and have the other connect.
Can I do this with WebRTC? Without running a dozen signalling servers?
For the sake of examples, how would you connect a browser running on 192.168.0.101 to one running on 192.168.0.102?
STUN/TURN is different from signaling.
STUN/TURN in WebRTC are used to gather ICE candidates. Signaling is used to transmit between these two PCs the session description (offer and answer).
You can use free STUN server (like stun.l.google.com or stun.services.mozilla.org). There are also free TURN servers, but not too many (these are resource expensive). One is numb.vigenie.ca.
Now there's no signaling server, because these are custom and can be done in many ways. Here's an article that I wrote. I ended up using Stomp now on client side and Spring on server side.
I guess you can tamper with SDP and inject the ICE candidates statically, but you'll still need to exchange SDP (and that's dinamycally generated each session) between these two PCs somehow. Even though, taking into account that the configuration will not change, I guess you can exchange it once (through the means of copy-paste :) ), stored it somewhere and use it every time.
If your end-points have static IPs then you can ignore STUN, TURN and ICE, which are just power-tools to drill holes in firewalls. Most people aren't that lucky.
Due to how WebRTC is structured, end-points do need a way to exchange call setup information (SDP) like media ports and key information ahead of time. How you get that information from A to B and back to A, is entirely up to you ("signaling server" is just a fancy word for this), but most people use something like a web socket server, the tic-tac-toe of client-initiated communication.
I think the simplest way to make this work on a private network without an internet connection is to install a basic web socket server on one of the machines.
As an example I recommend the very simple https://github.com/emannion/webrtc-web-socket which worked on my private network without an internet connection.
Follow the instructions to install the web socket server on e.g. 192.168.1.101, then have both end-points connect to 192.168.0.101:1337 with Chrome or Firefox. Share camera on both ends in the basic demo web UI, and hit Connect and you should be good to go.
If you need to do this entirely without any server, then this answer to a related question at least highlights the information you'd need to send across (in a cut'n'paste demo).

Peer to Peer, Javascript Games

I am writing a simple javascript game for a webpage. I am going to convert it to the desktop using tidesdk. I would like to allow players on different machines to play each other without the need to communicate through a server.
Is this possible in general? Is this Sockets?? Do you have any links of this being done with javascript code?
Is this possible with TideSdk? Do you know of any links to examples of this being done wiht TideSdk?
How do the players know what ip address/port their machine is on so they can give it to the other player?
I am sorry these are vague and open questions, but I don't really know where to start looking for this stuff, as I don't really know what the stuff I am looking for is called.
... Oh, and I don't want to use any third party stuff if I can help it. Maybe the jquery at a push.
This would be impossible with the APIs provided by web browsers (you would need to use something like Socket.IO and communicate through a server, as others have said). Fortunately, since you are using TideSDK, it is possible as long as you don't need a lot of network efficiency. You will need to provide a server, but it will not have to be powerful enough to host the actual games.
The General Client and Server Method
There are other ways to organize a network, but you can look those up if you think they'd be easier to implement.
Your server will host the actual game download and provide matchmaking capabilities. The clients that people download will contact this matchmaking server to find others who want to play.
The matchmaking server should select one of those clients to be a host for the others. Finally, the matchmaking server will tell the client selected as a host that it is the host and give it everyone's connection information (ports and IP addresses) while giving the other clients the connection information for the selected host. The host will connect to the other clients.
The host computer will be the only one that actually does any processing of gameplay, and the other clients just display whatever information the host sends them. The clients render the current state of the game from each player's perspective on their respective computers and capture user input, which is sent to the host for processing.
Implementation
TideSDK provides a Ti.Network.TCPSocket object which can make raw TCP client connections to TCP servers. Unfortunately, it does not also provide a way to make raw TCP servers. Instead, TideSDK provides a Ti.Network.HTTPServer object, which implements the HTTP protocol server over TCP, and a Ti.Network.HTTPClient object, which provides an HTTP client (it is actually just an abstraction over the normal AJAX request API). You can use the provided HTTP server on the host computer and directly connect to it on the clients using the provided HTTP clients. Data will be exchanged using the HTTP protocol. As far as I can tell, this is your only option here.
I did not find any example code out there (beyond what is in the TideSDK documentation) but you might find some if you are really interested.
Next Steps
If I wanted to go ahead with using TideSDK, I would do the following:
Tell the developers of TideSDK that you are interested in a TCP server socket. A raw TCP connection would be much faster than HTTP.
Test out the HTTP connection and find out if it is fast enough for my game.
Yes it's possible in general, and sockets are what you need. Although I don't think it's possible in practice, here's why.
Normally in a P2P game, there would be a server that knows who is online, and what their IP is. When new players connect to the server they will see a list of other users, they can select who they want to play.
Without having the server, there will be no way for users to see who is online, and to answer your 3rd question:
How do the players know what ip address/port their machine is on so they can give it to the other player? It doesn't matter if they can find their own IP, they have no way to find the IP of the opponent (without calling them on the phone :)).
So, if you want to build a game, then you'll need a server. I suggest Node.JS alongside Socket.IO

'Web' based push notifications for internal-only application

I'm already tossing around a solution but as I haven't done something like this before I wanted to check what SO thought before implementation.
Basically I need to modify an existing web based application that has approximately 20 users to add push notifications. It is important that the users get the notifications at the same time (PC-A shouldn't get an alert 20 seconds before PC-B). Currently the system works off of AJAX requests, sending to the server every 20 seconds and requesting any updates and completely rebuilding the table of data each time (even if data hasn't changed). This seems really sloppy so there's two methods I've come up with.
Don't break the connection from server-client. This idea I'm tossing around involves keeping the connection between server and client active the entire time. Bandwidth isn't really an issue with any solution as this is in an internal network for only approximately 20 people. With this solution the server could push Javascript to the client whenever there's an update and modify the table of data accordingly. Again, it's very important that every connected PC receives the updates as close to the same time as possible. The main drawback to this is my experience, I've never done it before so I'm not sure how well it'd work or if it's just generally a bad idea.
Continue with the AJAX request, but only respond in intervals. A second solution I've thought of would be to allow the clients to make AJAX requests as per usual (currently every 20 seconds) but have the server only respond in 30 second intervals (eg 2:00:00 and 2:00:30 regardless of how many AJAX requests it recieves in that span of time). This would require adjusting the timeout for the AJAX request to prevent the request timing out, but it sounds okay in theory, at least to me.
This is for an internal network only, so bandwidth isn't the primary concern, more so that the notification is received as close to each other as possible. I'm open to other ideas, those are just the two that I have thought of so far.
Edit
Primarily looking for pros and cons of each approach. DashK has another interesting approach but I'm wondering if anyone has experience with any of these methods and can attest to the strengths and weaknesses of each approach, or possibly another method.
If I understand well your needs I think you should take a look to Comet
Comet is a web application model in which a long-held HTTP request allows a web server to push data to a browser, without the browser explicitly requesting it. Comet is an umbrella term, encompassing multiple techniques for achieving this interaction. All these methods rely on features included by default in browsers, such as JavaScript, rather than on non-default plugins.
The Comet approach differs from the original model of the web, in which a browser requests a complete web page at a time.
How about using an XMPP server to solve the problem?
Originally designed to be an Instant Messaging platform, XMPP is a messaging protocol that enables users in the system to exchange messages. (There's more to this - But let's keep it simple.)
Let's simplify the scenario a little bit. Imagine the following:
You're a system admin. When the system
has a problem, you need to let all the
employees, about 20 of them, know that
the system is down.
In the old days, every employee will
ask you, "Is the system up?" every
hour or so, and you'll response
passively. While this works, you are
overloaded - Not by fixing system
outage, but by 20 people asking for
system status every hour.
Now, AIM is invented! Since every
employee has access to AIM, you
thought, "Hey, how about having every
single one of them join a 'System
Status' chat room, and I'll just send
a message to the room when the system
is down (or is back)?" By doing so,
employees who are interested in
knowing system status will simply join
the 'System Status' room, and will be
notified of system status update.
Back to the problem we're trying to solve...
System admin = "System" who wants to notify the web app users.
Employees = Web app users who wants to receive notification.
System Status chat room = Still, system Status chat room
When web app user signs on to your web app, make the page automatically logs them onto the XMPP server, and join the system status chat room.
When system wants to notify the user, write code to logon to the XMPP server, join the chat room, and broadcast a message to the room.
By using XMPP, you don't have to worry about:
Setting up "Lasting connection" - Some open source XMPP server, eJabberd/OpenFire, has built-in support for BOSH, XMPP's implementation of the Comet model.
How the message is delivered
You however will need the following:
Find a Javascript library that can help you to logon to an XMPP server. (Just Google. There're a lot.)
Find a XMPP library for the server-side code. (XMPP library exists for both Java & C#. But I'm not sure what system you're using behind the scene.)
Manually provision each user on the XMPP server (Seems like you only have 20 people. That should be easy - However, if the group grows bigger, you may want to perform auto-provisioning - Which is achievable through client-side Javascript XMPP library.)
As far as long-lasting AJAX calls, this implementation is limited by the at-most-2-connection-to-the-same-domain issue. If you used up one connection for this XMPP call, you only have 1 more connection to perform other AJAX calls in the web-app. Depending on how complex your webapp is, this may or may not be desirable, since if 2 AJAX calls have already been made, any subsequent AJAX call will have to wait until one of the AJAX pipeline freed up, which may cause "slowness" on your app.
You can fix this by converting all AJAX calls into XMPP messages, and have a bot-like user on the server to listen to those messages, and response to it by, say, sending back HTML snippets/JSON objects with the data. This however might be too much for what you're trying to achieve.
Ahh. Hope this makes sense... or not. :p
See http://ajaxpatterns.org/HTTP_Streaming
It allows You to push data from the server when server wants it. Not just after the query.
You could use this technique without making large changes to the current application, and synchronize output by the time on the server.
In addition to the other two great options above, you could look at Web Workers if you know they have latest Chrome, Safari, FF, or Opera for a browser.
A Worker has the added benefit of not operating in the same thread as the rest of the page, so performance will be better. The downside is that, for security purposes, you can only send string data between the two scripts and the worker does not have window or document context. However, JSON can be represented as a string, so there's really no limit to the data.
Workers can receive data multiple times and asynchronously. You set the onmessage handler to act each time it receives something.
If you can ask every user to use a specific browser (Latest Safari or Chrome), you can try WebSockets too.

Categories